Hacker News

pizlonator, yesterday at 5:26 PM

> Rematerializing 'safe' computation from across a barrier or thread sync/wait works wonders.

While this is literally "rematerialization", it's such a different case of remat from what I'm talking about that it should be a different phase. It's optimizing for a different goal.

It also feels very GPU-specific, so I'd imagine this being a pass you only add to the pipeline if you know you're targeting a GPU.
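
To make the barrier case concrete, here is a toy sketch in Python. Everything in it is hypothetical and invented for illustration (the `Inst` type, the op names, `remat_across_barriers`, `build_pipeline`); it is not any real compiler's IR or API. It assumes a straight-line instruction list and only treats cheap pure ops whose inputs are kernel parameters as "safe" to recompute.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """One instruction in a made-up straight-line IR (hypothetical)."""
    dest: Optional[str]            # name defined, or None for barrier/store
    op: str                        # e.g. "add", "load", "store", "barrier"
    args: Tuple[str, ...] = ()


# Assumption: these ops are cheap and side-effect free.
PURE_OPS = {"add", "mul", "shl"}


def remat_across_barriers(insts, params):
    """Recompute cheap pure values after a barrier instead of keeping
    them live across it. Only values computed purely from `params`
    (always-available inputs such as a thread id) are candidates."""
    out = []
    candidates = {}  # original name -> defining instruction
    current = {}     # original name -> name valid in the current region
    fresh = 0
    for inst in insts:
        if inst.op == "barrier":
            out.append(inst)
            current.clear()  # force recomputation on first use after the barrier
            continue
        new_args = []
        for a in inst.args:
            if a in candidates and a not in current:
                # First use since the last barrier: clone the definition here.
                fresh += 1
                clone = f"{a}.remat{fresh}"
                d = candidates[a]
                out.append(Inst(clone, d.op, d.args))
                current[a] = clone
            new_args.append(current.get(a, a))
        out.append(Inst(inst.dest, inst.op, tuple(new_args)))
        if (inst.dest is not None and inst.op in PURE_OPS
                and all(x in params for x in inst.args)):
            candidates[inst.dest] = inst
            current[inst.dest] = inst.dest
    return out


def build_pipeline(target):
    """Only schedule the barrier remat pass when targeting a GPU."""
    passes = []
    if target == "gpu":
        passes.append(remat_across_barriers)
    return passes


params = {"tid", "base"}
prog = [
    Inst("idx", "add", ("base", "tid")),
    Inst("v", "load", ("idx",)),
    Inst(None, "barrier"),
    Inst(None, "store", ("idx", "v")),
]
for run_pass in build_pipeline("gpu"):
    prog = run_pass(prog, params)
for inst in prog:
    print(inst)
```

In the rewritten program the `add` is cloned after the barrier and the store uses the clone, so `idx` no longer has to stay live across the sync; `v` comes from a load, which this sketch never recomputes, so it stays live as before.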

> Also loads and stores and function calls, but that's a bit finicky to tune. We usually tell people to update their programs when this is needed.

This also feels like it's gotta be GPU-specific.

No chance that doing this on a CPU would be a speed-up unless it saved you register pressure.