Hacker News

ashleyn today at 2:28 AM (5 replies)

There are modern VLIW architectures. I think Groq uses one. The lessons on what works and what doesn't are worth learning from history.


Replies

bri3d today at 2:48 AM

VLIW works for workloads where the compiler can predict reasonably accurately what will be resident in cache. It’s used everywhere in DSPs, was common in GPUs for a while, and is present in lots of niche accelerators. It’s a dead end for situations where cache residency is not predictable, like any kind of multitenant general-purpose workload.
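To make the cache-residency point concrete, here is a toy sketch in Python. It assumes an in-order, statically scheduled machine that stalls every issue slot whenever a load takes longer than the compiler assumed, and that misses do not overlap. The function name `run_time` and the latency numbers are made up for illustration, not taken from any real chip.

```python
def run_time(static_cycles, assumed_latency, actual_latencies):
    """Total cycles for a statically scheduled, in-order machine.

    static_cycles: schedule length the compiler planned for.
    assumed_latency: load latency the compiler scheduled around.
    actual_latencies: latency each load really took at run time.
    Every cycle a load overruns the assumption stalls the whole machine.
    """
    stalls = sum(max(0, actual - assumed_latency) for actual in actual_latencies)
    return static_cycles + stalls


# Illustrative latencies in cycles (assumptions, not measurements).
HIT, MISS = 3, 200

# DSP-like kernel: all ten loads hit, so the static schedule holds exactly.
predictable = run_time(100, HIT, [HIT] * 10)

# Multitenant-like case: two of the ten loads miss unexpectedly.
unpredictable = run_time(100, HIT, [HIT] * 8 + [MISS] * 2)
```

In this model two unpredicted misses out of ten loads are enough to swamp the planned schedule, which is the situation an out-of-order core is built to hide.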

addaon today at 2:33 AM

A more everyday example is the Hexagon DSP ISA in Qualcomm chips. Four-wide VLIW + SMT.

0dyl today at 4:52 AM

The new TI C2000 F29 series of microcontrollers is VLIW.

vardump today at 3:25 AM

I meant IA64 specifically. There is certainly some lessons-learned value there.

msla today at 5:00 AM

IA64 was EPIC, which was itself a "lessons learned" VLIW design. It had stop bits to explicitly demarcate dependency boundaries, so that instructions from multiple words could be combined on future hardware with more parallelism. It also had speculative execution and speculative loads; see the article on how the speculative loads were a mixed blessing.

https://en.wikipedia.org/wiki/Explicitly_parallel_instructio...
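
A rough sketch of the stop-bit idea, in Python. It is a deliberately simplified model: it ignores bundle templates, functional-unit types, and latencies, and the `Insn`, `split_groups`, and `cycles_needed` names are hypothetical. The only rule it keeps is that instructions between two stop bits are declared independent by the compiler, so hardware may issue as many of them per cycle as it has slots.

```python
from typing import List, NamedTuple


class Insn(NamedTuple):
    text: str   # assembly text, for display only
    stop: bool  # True if a stop bit follows this instruction


def split_groups(stream: List[Insn]) -> List[List[str]]:
    """Split the stream into instruction groups at each stop bit."""
    groups, current = [], []
    for insn in stream:
        current.append(insn.text)
        if insn.stop:
            groups.append(current)
            current = []
    if current:  # trailing instructions with no final stop bit
        groups.append(current)
    return groups


def cycles_needed(stream: List[Insn], width: int) -> int:
    """Cycles to issue the stream on a machine with `width` issue slots.

    A group never shares a cycle with the next group, because the stop
    bit marks a possible dependency between them.
    """
    return sum(-(-len(group) // width) for group in split_groups(stream))


# Illustrative stream: four independent instructions, then two that use their results.
stream = [
    Insn("add r1 = r2, r3", False),
    Insn("add r4 = r5, r6", False),
    Insn("ld8 r7 = [r8]", False),
    Insn("sub r9 = r10, r11", True),
    Insn("add r12 = r1, r4", False),
    Insn("add r13 = r7, r9", True),
]

narrow = cycles_needed(stream, 2)  # two issue slots
wide = cycles_needed(stream, 4)    # four issue slots, same binary
```

The same instruction stream takes fewer cycles on the wider machine without recompiling, which is what distinguishes this from a classic VLIW where the word width is baked into the binary.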