That is disappointing. One would think that with all its budget and compute, Google would be able to create something that beats methods from the 70s. Maybe we are hitting some hard limits.
Maybe it would be better to train an LLM on various tuning methodologies and build a dedicated ARIMA agent. You throw in the data, some metadata, and the requested forecast window. Out come the parameters for an "optimal" conventional model.
I think this could be an interesting read for you. I read it last week, and it argues much the same points: https://shakoist.substack.com/p/against-time-series-foundati...