> To build it requires companies to invest a sum of money unlike anything in living memory.
Do we know this? Smaller, more carefully curated training sets are proving valuable and gaining traction. The strategy of throwing huge amounts of data at LLMs seems specific to companies that are trying to dominate this space regardless of cost. It may turn out that more modest, better-optimized methods end up winning this race, much as WebVan flamed out and took huge amounts of investment money with it, while Instacart now serves the same sector in a way that actually works robustly and profitably.