Why would I buy chips to run handicapped models when the 10+ LLM players all offer free-tier access to their 1T+ parameter models?
Do you think the free gravy train will run forever?
Not all applications are chatbots. Many potential uses for LLMs/VLAMs are latency constrained.