Hacker News

tomhow · today at 11:08 AM · 6 replies

[under-the-rug stub]

[see https://news.ycombinator.com/item?id=45988611 for explanation]


Replies

walterbell · today at 3:42 PM

Excellent HN-esque innovation in moderation: immediate improvement in S/N ratio, unobtrusive UX, gentle feedback to humans, semantic signal to machines.

How was the term "rug" chosen, e.g. in the historical context of newspaper folds?

coderintherye · today at 8:40 AM

Really well done article.

I'd note that when I gave the input/output screenshot to ChatGPT 5.2, it failed (with lots of colorful chain of thought), though Gemini got it right away.

simedw · last Tuesday at 7:07 PM

Thanks for sharing; you clearly spent a lot of time making this easy to digest. I especially like the tokens-to-embedding visualisation.

I recently had some trouble converting an HF transformer I trained with PyTorch to Core ML. I just couldn’t get the KV cache to work, which made it unusably slow after 50 tokens…
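For anyone wondering why a broken KV cache tanks decode speed: without it, every generated token reprojects K and V for the entire prefix, so per-token work grows with sequence length. A toy single-head attention sketch (random stand-in weights, nothing Core ML-specific) showing the cached and uncached paths produce identical outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend_full(xs):
    """No cache: re-project K/V for the whole prefix on every step (O(n) per step)."""
    X = np.stack(xs)                     # (t, d)
    q = xs[-1] @ Wq                      # query for the newest token only
    K, V = X @ Wk, X @ Wv                # recomputed from scratch each call
    return softmax(q @ K.T / np.sqrt(d)) @ V

class KVCache:
    """With cache: keep old K/V rows, project only the newest token (O(1) per step)."""
    def __init__(self):
        self.K, self.V = [], []
    def attend(self, x):
        self.K.append(x @ Wk)
        self.V.append(x @ Wv)
        K, V = np.stack(self.K), np.stack(self.V)
        q = x @ Wq
        return softmax(q @ K.T / np.sqrt(d)) @ V

cache = KVCache()
xs = []
for _ in range(5):                       # 5 decode steps
    x = rng.standard_normal(d)
    xs.append(x)
    assert np.allclose(attend_full(xs), cache.attend(x))
```

The cached path only touches the newest token's projections each step, which is exactly the saving that disappears when an export can't carry state between calls.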

show 1 reply
ThePyCoder · today at 9:57 AM

What an excellent write-up. Thank you!

show 1 reply
wesammikhail · today at 9:10 AM

Amazing article. I was under the misapprehension that temp and other output parameters actually do affect caching. Turns out I was wrong and this explains why beautifully.

Great work. Learned a lot!
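Assuming a typical prefix-cache design (cache keyed only on the prompt token IDs, with sampling applied to the logits after the forward pass), a toy sketch of why temperature can't change cache hits — the key derivation and function names here are hypothetical, not any provider's real implementation:

```python
import hashlib
import math

def prefix_cache_key(token_ids):
    # Hypothetical: the key depends only on the tokens entering the forward
    # pass. Same prompt -> same key, regardless of temperature/top_p/etc.
    return hashlib.sha256(repr(token_ids).encode()).hexdigest()

def sample_dist(logits, temperature):
    # Temperature rescales logits AFTER the (cacheable) forward pass,
    # so it never enters the cache key.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

prompt = [101, 7592, 2088]
# Two requests with very different temperatures share one cache entry:
assert prefix_cache_key(prompt) == prefix_cache_key(prompt)
cold = sample_dist([1.0, 2.0, 3.0], temperature=0.2)  # sharper distribution
hot = sample_dist([1.0, 2.0, 3.0], temperature=2.0)   # flatter distribution
assert cold[2] > hot[2]  # only the output distribution changes, not the key
```

The two sampled distributions differ, but the cache key is identical, which is the behavior the article describes.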

show 2 replies