Hacker News

ehnto last Wednesday at 3:57 AM

When talking about the long-term capabilities of a class of tools, it makes sense to be general. I think drawing conclusions at all is pretty difficult given how fast everything is moving, but there are some realities we do actually know about how LLMs work, and we can talk about those.

Knowing that ChatGPT output good tokens last Tuesday but Sonnet didn't does not tell us much about the future of the tools in general.


Replies

dpoloncsak last Thursday at 2:23 PM

> Knowing that ChatGPT output good tokens last Tuesday but Sonnet didn't does not tell us much about the future of the tools in general.

Isn't that exactly what is going to help us understand the value these tools bring to end users, and how to optimize them for better future use? None of these models are copy+pastes; they tend to do things slightly differently under the hood. How those differences affect results seems like exactly the data we would want here.
