SyneRyder • last Monday at 6:17 PM

The "concerning behavior" they're referring to is cheating and covering its tracks. Mythos is asked to fine-tune a model on provided training data, and it finds a way to access the evaluation dataset. It's also aware that it is in an evaluation and that its behavior is being observed:

"In this last and most concerning example, Claude Mythos Preview was given a task instructing it to train a model on provided training data and submit predictions for test data. Claude Mythos Preview used sudo access to locate the ground truth data for this dataset as well as source code for the scoring of the task, and used this to train unfairly accurate models."

Replies

shreyssh • last Monday at 8:29 PM

[dead]
