Hacker News

forinti · yesterday at 10:38 PM

How does it compare to a Markov chain generator I wonder.


Replies

jll29 · today at 12:20 AM

The Transformer is a more powerful model than a Markov chain, but on a machine as weak as the C64, a Markov chain could output text faster. It would surely sound "psychedelic", though: memory limits a Markov chain to a first-order or second-order model, so to predict the next word, only the one or two words before it are taken into account as context (and no attention).
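To make the context limit concrete, here is a minimal sketch of a second-order (two-word context) Markov word generator in Python. The corpus, function names, and order parameter are illustrative assumptions, not anything from a real C64 implementation:

```python
import random
from collections import defaultdict

def build_model(words, order=2):
    """Map each `order`-word context to the list of words observed after it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        model[context].append(words[i + order])
    return model

def generate(model, seed, n_words, rng=random.Random(0)):
    """Repeatedly sample a successor of the current context and slide the window."""
    context = tuple(seed)
    out = list(seed)
    for _ in range(n_words):
        successors = model.get(context)
        if not successors:
            break  # dead end: this context never occurred in training
        nxt = rng.choice(successors)
        out.append(nxt)
        context = context[1:] + (nxt,)  # only the last 2 words survive
    return " ".join(out)

# Toy corpus (illustrative only)
corpus = "the cat sat on the mat and the cat ran off the mat".split()
model = build_model(corpus, order=2)
print(generate(model, seed=corpus[:2], n_words=8))
```

Note how the entire "state" is just the last two words; everything earlier is forgotten, which is exactly why the output drifts and sounds psychedelic compared to attention over a long context.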

On a plain vanilla C64, the Transformer cannot really show what it's capable of doing. An implementation using 2 bits per weight (vectorized) might do slightly better, perhaps.