Jetson uses LPDDR, though. H100 failures seem driven by HBM's heat sensitivity and the 700W+ power envelope. That's a completely different thermal-density regime.
Reliability also depends strongly on current density and applied voltage, perhaps even more than on thermal density itself. So "slowing down" your average GPU in a long-term sustainable way ought to improve those reliability figures through multiple mechanisms. Jetsons are great for very small-scale, self-contained tasks (including on a performance-per-watt basis), but their limits are just as obvious, especially given the recently announced advances in clustering the big server GPUs at rack and perhaps multi-rack scale.
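To put a rough number on the current-density part of that: electromigration lifetime is commonly modeled with Black's equation, MTTF ∝ J^-n · exp(Ea/kT), so a modest power cap that trims both current density and hotspot temperature compounds through both terms. A back-of-envelope sketch in Python (the exponent n, activation energy Ea, and reference temperature below are illustrative placeholders, not measured values for any real GPU):

    # Back-of-envelope: effect of current density (J) and temperature (T) on
    # electromigration lifetime via Black's equation: MTTF = A * J**-n * exp(Ea/(k*T)).
    # n, ea, and the 85 C reference point are illustrative placeholders only.
    import math

    K_BOLTZMANN = 8.617e-5  # Boltzmann constant, eV/K

    def mttf_black(j_rel: float, temp_c: float, n: float = 2.0, ea: float = 0.9) -> float:
        """Relative MTTF at current density j_rel (vs. nominal) and temp_c (Celsius).
        Normalized so j_rel=1.0 at 85 C gives MTTF=1.0."""
        t = temp_c + 273.15
        t_ref = 85.0 + 273.15
        return (j_rel ** -n) * math.exp((ea / K_BOLTZMANN) * (1.0 / t - 1.0 / t_ref))

    # "Slowing down": a modest power cap that cuts current density ~15%
    # and drops the hotspot ~10 C compounds through both terms.
    print(mttf_black(1.00, 85.0))  # baseline -> 1.0
    print(mttf_black(0.85, 75.0))  # ~15% less J, 10 C cooler -> ~3.2x MTTF

With those placeholder constants, 15% less current density plus a 10 C cooler hotspot works out to roughly a 3x lifetime multiplier, and that's before counting voltage-driven wear-out mechanisms like dielectric breakdown that also ease off at lower operating points.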