Modal's GPU glossary is a good overview of how GPUs work [0]. Karpathy's LLM overview is a good high-level introduction to LLMs [1]. 3b1b's video on transformers (and the subsequent videos) was excellent at helping me understand the math at a high level [2]. This matrix multiplication optimization worklog helped me understand how to write better CUDA (it's not a beginner intro, though) [3].
During this process I also asked ChatGPT a lot of questions.
I'm definitely open to suggestions about "how to learn" with all the new tools we have. I've found that this hasn't been straightforward to figure out.
[0] https://modal.com/gpu-glossary
[1] https://www.youtube.com/watch?v=7xTGNNLPyMI