Hacker News

leonidasrup today at 6:00 AM

For example, you can do loop unrolling using C++ template meta-programming.

https://cpplove.blogspot.com/2012/07/a-generic-loop-unroller...
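To make the idea concrete, here is a minimal sketch of what a template-based unroller can look like. This is not the code from the linked article; `Unroll` and `sum4` are hypothetical names made up for illustration. The recursion is resolved at compile time, so `Unroll<4>::run(f)` expands to `f(0); f(1); f(2); f(3);` once the calls are inlined.

```cpp
#include <cstddef>

// Hypothetical helper (not from the linked article): calls f(0), f(1), ...,
// f(N-1) in order, with the recursion expanded by the compiler.
template <std::size_t N>
struct Unroll {
    template <typename F>
    static void run(F&& f) {
        Unroll<N - 1>::run(f);  // expand iterations 0 .. N-2 first
        f(N - 1);               // then the last iteration
    }
};

// Base case: zero iterations, nothing to do.
template <>
struct Unroll<0> {
    template <typename F>
    static void run(F&&) {}
};

// Illustrative use: sum the first four elements without a runtime loop.
inline int sum4(const int* a) {
    int s = 0;
    Unroll<4>::run([&](std::size_t i) { s += a[i]; });
    return s;
}
```

Whether this is actually faster than a plain `for` loop depends on the compiler and target, so measure before relying on it.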

Of course, nothing beats hand-written ffmpeg-style assembly, which takes into account optimal register allocation, instruction scheduling, cache alignment, etc. for specific processor architectures.


Replies

jeffreygoesto today at 6:57 AM

Careful. That article is from 2012, and compile-time unrolling was more useful back then. Today it can actually be harmful, as it hides strong hints about the loop from the optimizer. Our code that did this fared worse than a plain loop, because no optimizer writer expected unrolled loops.