measurablefunc, yesterday at 10:32 PM

I read the paper. All the prerequisites are already available in existing literature & they basically profiled & optimized around the bottlenecks to avoid pipeline stalls w/ instructions that utilize the available tensor & CUDA cores. Seems like something these super duper AIs that don't get tired should be able to do pretty easily.