gruez • yesterday at 4:31 PM • 1 reply

That's not really a fair test because you're leading the model pretty hard, even if the prompt doesn't specifically say there's a bug to be found. It's basically the same objection that people raised in the thread where someone claimed current models are just as good as mythos.

Replies

shay_ker • yesterday at 4:59 PM

right, exactly, but clearly it's possible to elicit the behavior we want from the model, which means the capabilities are there!
