Hacker News

Majromax today at 1:37 PM

Penny-wise and pound-foolish. Non-ECC RAM might save on the small amount of RAM power, but if a bit-flip causes a failed computation then an entire forwards/backwards step – possibly involving several nodes – might need to be redone.


Replies

hylaride today at 3:58 PM

Linus Torvalds was recently on Linus Tech Tips to build a new computer, and he insisted on ECC RAM. Torvalds was convinced that memory errors are a much greater problem for stability than is commonly assumed, and he has spent an inordinate amount of time chasing phantom bugs because of them.

https://www.youtube.com/watch?v=mfv0V1SxbNA

coldtea today at 2:10 PM

>but if a bit-flip causes a failed computation then an entire forwards/backwards step – possibly involving several nodes – might need to be redone.

Which, for the most part, would be an irrelevant cost of doing business compared to the huge savings from non-ECC, given how inconsequential it is if some ChatGPT computation fails...