
Y_Y, last Monday at 5:44 PM

You raise a really interesting point. I'm sure it's just escaped my notice, but I'm not familiar with any projects from antediluvian AI that have been resurrected on modern hardware to see where they'd really asymptote if they'd had the compute they deserved.


Replies

rsfern, yesterday at 3:50 AM

This paper, “Were RNNs All We Needed?”, explores this hypothesis a bit, finding that some pre-transformer sequence models can match transformers when trained at appropriate scale, though the authors did have to make some modifications to unlock more parallelism.

https://arxiv.org/abs/2410.01201
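For a sense of what that modification looks like: the paper's minGRU makes the gate and candidate depend only on the current input, not the previous hidden state, which turns the recurrence into a linear one that can be computed with a prefix scan instead of a step-by-step loop. Here's a rough NumPy sketch of that idea (the random weights and dimensions are illustrative, not the paper's setup):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
T, d = 10, 4  # sequence length, hidden size (arbitrary for the demo)
x = rng.normal(size=(T, d))
# Stand-ins for the learned linear layers in the paper.
Wz = rng.normal(size=(d, d)) * 0.1
Wh = rng.normal(size=(d, d)) * 0.1

# Key change in minGRU: gate z_t and candidate h~_t depend only on x_t,
# not on h_{t-1}, so all of them can be computed up front in parallel.
z = sigmoid(x @ Wz)
h_tilde = x @ Wh

# 1) Sequential reference: the classic RNN-style loop.
#    h_t = (1 - z_t) * h_{t-1} + z_t * h~_t, starting from h_0 = 0.
h_seq = np.zeros((T, d))
h = np.zeros(d)
for t in range(T):
    h = (1 - z[t]) * h + z[t] * h_tilde[t]
    h_seq[t] = h

# 2) Because the recurrence is linear in h (h_t = a_t * h_{t-1} + b_t),
#    it has the closed form h_t = A_t * sum_{s<=t} b_s / A_s with
#    A_t = prod_{r<=t} a_r -- i.e. it's a prefix scan, parallelizable
#    across the sequence. (This naive cumprod form is fine for short
#    sequences; the paper uses a numerically safer log-space scan.)
a = 1 - z
b = z * h_tilde
A = np.cumprod(a, axis=0)
h_scan = A * np.cumsum(b / A, axis=0)

print(np.allclose(h_seq, h_scan))  # both formulations agree
```

The point is that a vanilla GRU can't do step 2 at all, because its gates read `h_{t-1}`, forcing the O(T) sequential loop; decoupling them is what lets the model train with the same parallelism budget as a transformer.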

FeepingCreature, last Monday at 6:21 PM

To be fair, usually those projects would need considerable work to be ported to modern multicore machines, let alone GPUs.
