
simedw, last Tuesday at 7:07 PM

Thanks for sharing; you clearly spent a lot of time making this easy to digest. I especially like the tokens-to-embedding visualisation.

I recently had some trouble converting a HF transformer I trained with PyTorch to Core ML. I just couldn’t get the KV cache to work, which made it unusably slow after 50 tokens…


Replies

samwho, last Wednesday at 9:13 AM

Thank you so much <3

Yes, I recently wrote https://github.com/samwho/llmwalk and had a similar experience with cache vs no cache. It’s so impactful.
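A rough sketch of the difference, assuming a standard Hugging Face causal LM in PyTorch (the gpt2 checkpoint and prompt are just placeholders): with the cache, each decode step feeds only the newest token and reuses past_key_values; without it, attention is recomputed over the whole sequence every step, so per-token cost grows with length and generation crawls after a few dozen tokens.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder checkpoint; any causal LM works the same way.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    input_ids = tok("The quick brown fox", return_tensors="pt").input_ids
    past = None

    with torch.no_grad():
        for _ in range(50):
            if past is None:
                # First step: run the whole prompt once and keep the cache.
                out = model(input_ids, use_cache=True)
            else:
                # Cached steps feed only the newest token; dropping the cache
                # means re-running attention over the entire sequence each step.
                out = model(input_ids[:, -1:], past_key_values=past, use_cache=True)
            past = out.past_key_values
            next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
            input_ids = torch.cat([input_ids, next_id], dim=-1)

    print(tok.decode(input_ids[0]))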
