GPUs are still software programmable.
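An "LLM chip" does not need that flexibility, so it can drop much of the instruction fetch, decode, and scheduling hardware that programmability requires and be much more efficient.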