awestroke • today at 6:55 PM • 5 replies

This is becoming a bit scary. I almost hope we'll reach some kind of plateau for LLM intelligence soon.

Replies

A plateau is unlikely, at least for cybersecurity. RL scales well here and is replicable outside of Anthropic (rewards are verifiable, so setting up the training environment doesn't require that much cleverness).

The post also points out that the model wasn't trained specifically on cybersecurity, and that the capability was just a side effect, so I think there's still a lot of headroom.

It's scary, but there's also some room for cautious non-pessimism. More people than ever can cause billions of dollars of damage in attacks now [1], but the same tools can be used defensively. For that reason, I'm more optimistic about mitigations in security than in other risk areas like biosecurity.

[1]: https://www.noahlebovic.com/testing-an-autonomous-hacker/

hibikir • today at 8:06 PM

On a topic like cybersecurity, we never win by not looking: one needs top-of-the-line knowledge of how to break a system to be able to protect it. We face that dilemma with human experts: the same government-sponsored unit that tells you to update your encryption can hold on to that information and exploit it at its leisure.

Given that it's absolutely impossible to stop people not aligned with us (for any definition of us) from doing AI research, the most reasonable way forward is to dedicate compute resources to the frontier, and to automatically send reasonable disclosures to major projects. That could itself be a pretty reasonable product. Just as you pay for dubious security scans and publicize that you run them, an LLM company could offer genuinely expensive security reviews with a preview model, and charge accordingly.

dist-epoch • today at 8:56 PM

The immediate plateau is the energy output of the Sun captured by the Dyson Swarm around it. Until then it's smooth sailing.

esafak • today at 7:53 PM

We need to promote alignment and other ethics benchmarks; we can't change what we don't measure. I don't even know any off the top of my head.

websap • today at 7:09 PM

If we don't innovate, someone else will. This is the very nature of being a human being. We summit mountains, regardless of the danger or challenge.
