What’s the state of the art of reverse engineering source code from binaries in the age of agentic c...

oofbey • yesterday at 9:35 PM • 4 replies

What’s the state of the art of reverse engineering source code from binaries in the age of agentic coding? Seems like something agents should be pretty good at, but haven’t read anything about it.

Replies

roughly • yesterday at 9:48 PM

I think there’s a good chance that LLMs could be usefully trained to decode binaries as a sort of squint-and-you-can-see-it translation problem, but I can’t imagine, e.g., a pre-trained GPT being particularly good at it.

JasonADrury • yesterday at 9:39 PM

I've been working on this, and the results are pretty great with the fancier models. I have successfully had gpt5.2 complete fairly complex matching decompilation projects, as well as projects with more flexible requirements.

TZubiri • yesterday at 9:46 PM

Nothing yet. Agents analyze code, which is textual.

The way they analyze binaries now is through the textual interfaces of command-line tools, and the tools used are mostly the ones the foundation models saw at training time. For the most part you can't teach a model new tools at inference; they have to be supported at training. So most providers focus on the same tools and benchmark against them. Binary analysis is not in the zeitgeist right now: the focus is on producing code more than understanding it.
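
As a rough illustration of what a "textual interface" to a binary looks like, here is a minimal sketch in pure Python. The two helpers, `extract_strings` and `hexdump`, are hypothetical names made up for this example; they only imitate the kind of output a `strings`-style or hex-dump tool produces and are not any agent's or provider's actual tooling.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # Printable ASCII is 0x20 (space) through 0x7e (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex columns, and an ASCII gutter."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the gutter.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up bytes standing in for a fragment of a binary.
    sample = b"\x7fELF\x02\x01\x00\x00hello world\x00\x01\x02usage: %s\x00ab\x00"
    print(extract_strings(sample))
    print(hexdump(sample))
```

Text like this is what ends up in the model's context: the agent never sees the raw bytes, only whatever a tool chooses to print about them.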

refulgentis • yesterday at 9:37 PM

Agents are sort of irrelevant to this discussion, no?

Like, it's assuredly harder for an agent than having access to the code, if only because there's a theoretical opportunity to misunderstand the decompile.

Alternatively, it's assuredly easier for an agent because, as execution time approaches infinity, they can try all possible interpretations.
