Hacker News

dannyw today at 3:31 AM

If you’re doing evals, you’re basically doing RLAIF without training a model; you’re just looking at the results.

Fundamentally it is very difficult to stop this while still making your AI models useful.


Replies

zmgsabst today at 6:26 AM

Similarly, if you did a corpus study on bioRxiv to summarize recent scientific findings, you could use the same questions and answers to fine-tune a model.

There is no way to communicate information at scale to companies through the API, for anything approaching a real application, without that information forming a corpus another model can be trained on.

But it wouldn’t be the first time they broke a model:

Their “guardrails” that cause it to reject user prompts also mean it relies on its pop science summary of medicine to tell you why bioRxiv is wrong rather than accurately summarizing the papers.

They’ve successfully created a smug, argumentative average of the internet which refuses to even consider that it might be wrong, or that it’s reading a science paper which is based on measurements and not vibes. But why would I pay for that?

I get it for free online.