Hacker News

alansaber, today at 9:00 AM

This is a little too whimsical for me, but distributed model training across thousands of GPUs can introduce lots of little quirks that are impossible to trace back to an exact source.