This looks great, but I'd really like to see associated exercises (and solutions) to make it useful for self-study.
"Modern [NVIDIA GPU] Programming for ..."
Everything from "Pipelining GEMM with TMA" onward is specific to NVIDIA, which is fine, but the title of the guide itself is clearly misleading.