ashleyn • today at 2:28 AM • 5 replies

There are modern VLIW architectures. I think Groq uses one. The lessons on what works and what doesn't are worth learning from history.

Replies

bri3d • today at 2:48 AM

VLIW works for workloads where the compiler can predict, with reasonable accuracy, what will be resident in cache. It's used everywhere in DSPs, was common in GPUs for a while, and is present in lots of niche accelerators. It's a dead end for situations where cache residency is not predictable, like any kind of multitenant general-purpose workload.
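To make the cache-residency point concrete, here is a toy model, not a description of any real core. It assumes an in-order machine that issues one bundle per cycle and stalls the whole pipeline whenever a load takes longer than the latency the compiler scheduled for. The function name and all the numbers are made up for illustration.

```python
from typing import Sequence


def static_schedule_cycles(
    num_bundles: int,
    assumed_load_latency: int,
    actual_load_latencies: Sequence[int],
) -> int:
    """Cycles to run a statically scheduled program in this toy model.

    Assumption (hypothetical model): one bundle issues per cycle, and each
    load that is slower than the compiler assumed stalls the entire machine
    for the difference. Loads that come back early gain nothing, because
    the schedule is fixed at compile time.
    """
    stall_cycles = sum(
        max(0, actual - assumed_load_latency)
        for actual in actual_load_latencies
    )
    return num_bundles + stall_cycles


# Predictable case: every load hits in cache at the latency the compiler assumed.
predictable = static_schedule_cycles(10, 3, [3, 3, 3])

# Unpredictable case: one load misses and takes 100 cycles instead of 3.
one_miss = static_schedule_cycles(10, 3, [3, 100, 3])
```

In this model a single unexpected miss dominates the run time, which is roughly the failure mode described above. An out-of-order core could try to overlap that miss with other work, whereas a fixed schedule cannot.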

addaon • today at 2:33 AM

A more everyday example is the Hexagon DSP ISA in Qualcomm chips. Four-wide VLIW + SMT.

0dyl • today at 4:52 AM

The new TI C2000 F29 series of microcontrollers is VLIW.

vardump • today at 3:25 AM

I meant IA64 only, narrowly. There is surely some lessons-learned value.

msla • today at 5:00 AM

IA64 was EPIC, which was itself a "lessons learned" VLIW design. It had stop bits to explicitly demarcate dependency boundaries, so that instructions from multiple words could be combined on future hardware with more parallelism. It also had speculative execution and loads, which, well, see the article on how the speculative loads were a mixed blessing.

https://en.wikipedia.org/wiki/Explicitly_parallel_instructio...
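A rough sketch of the stop-bit idea, under heavy simplification: each instruction carries a flag saying "a dependency boundary follows me", and the hardware may issue everything between two stops in parallel, limited only by its own width. This is not the real IA-64 encoding (which packs three instructions per bundle and encodes stops in a template field). The instruction names, the `STREAM` contents, and the one-cycle-per-issue cost model are all hypothetical.

```python
from typing import List, Tuple

# (mnemonic, stop_after): stop_after=True marks a dependency boundary.
# Hypothetical toy stream, not real IA-64 assembly or encoding.
Instr = Tuple[str, bool]


def split_groups(stream: List[Instr]) -> List[List[str]]:
    """Split a stream into instruction groups at each stop bit."""
    groups: List[List[str]] = []
    current: List[str] = []
    for name, stop_after in stream:
        current.append(name)
        if stop_after:
            groups.append(current)
            current = []
    if current:  # trailing instructions with no final stop
        groups.append(current)
    return groups


def issue_cycles(stream: List[Instr], width: int) -> int:
    """Cycles to issue the stream on hardware `width` instructions wide.

    Assumption: instructions within a group are independent, so a group of
    n instructions needs ceil(n / width) cycles, and groups never overlap.
    """
    return sum(-(-len(group) // width) for group in split_groups(stream))


STREAM: List[Instr] = [
    ("ld r1", False), ("ld r2", False), ("ld r3", False), ("ld r4", True),
    ("add r5", False), ("add r6", True),
    ("mul r7", True),
]

narrow = issue_cycles(STREAM, width=2)  # groups of 4, 2, 1 -> 2 + 1 + 1
wide = issue_cycles(STREAM, width=4)    # same binary, wider machine -> 1 + 1 + 1
```

The same unmodified stream takes fewer cycles on the wider machine because the parallelism is marked by the stops rather than baked into a fixed word width, which is the forward-compatibility argument made above.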