Hacker News

evilduck yesterday at 6:05 PM

To be fair to your field, that advancement seems expected, no? We can do things to LLMs that we can't ethically or practically do to humans.


Replies

AlphaAndOmega0 yesterday at 9:31 PM

I'm still impressed by the progress in interpretability. I remember being quite pessimistic that we'd achieve even what we have today (and I recall that being the consensus among ML researchers at the time). In other words, while capabilities have advanced at about the pace I expected from the GPT-2/3 days, mechanistic interpretability has advanced even faster than I'd hoped for (though in some ways we are still very far from completely understanding how LLMs work).