Now I wonder if you can compare this against feeding it into a GPT-style transformer with a parameter count of a similar order of magnitude.
That's the question today. Turns out transformers really are a leap forward for AI, whereas Markov chains, even scaled up to today's level of resources and capacity, will still output gibberish.
I thought for a moment your comment was the output of a Markov chain trained on HN.