7777332215 • yesterday at 4:47 PM • 7 replies

I know they said they didn't obfuscate anything, but if you hide imports/symbols and obfuscate strings, which is the bare minimum for any competent attacker, the success rate will immediately drop to zero.

This is detecting the pattern of an anomaly in language associated with malicious activity, which is not impressive for an LLM.
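To make the string-obfuscation point concrete, here is a minimal sketch (not taken from the benchmark) of why a plain printable-string scan finds nothing once strings are XOR-encoded. The payload string, the single-byte key, and the helper names are all hypothetical, and real malware typically uses stronger schemes than this.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of printable ASCII, roughly what a `strings`-style scan reports."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, blob)


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (the same call encodes and decodes)."""
    return bytes(b ^ key for b in data)


SECRET = b"/bin/sh -c backdoor"  # hypothetical suspicious string
KEY = 0xA5                       # hypothetical key; its high bit pushes ASCII out of the printable range

# Two toy "binaries": one stores the string in the clear, one stores it XOR-encoded.
plain_blob = b"\x00\x01" + SECRET + b"\x00\x02"
hidden_blob = b"\x00\x01" + xor_bytes(SECRET, KEY) + b"\x00\x02"

print(printable_strings(plain_blob))   # the string is visible to the scan
print(printable_strings(hidden_blob))  # nothing printable survives the encoding
print(xor_bytes(xor_bytes(SECRET, KEY), KEY) == SECRET)  # decoding at runtime is trivial
```

The scan only sees bytes at rest. Anything that matches on literal strings, whether a heuristic or a model reading disassembly, has to notice the decode loop instead.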

Replies

Avamander • yesterday at 11:58 PM

I have seen LLMs be surprisingly effective at figuring out such oddities. After all, they have ingested knowledge of a myriad of data formats, encryption schemes and obfuscation methods.

If anything, complex logic is what'll defeat an LLM. But a good model will also flag such logic as intractable.

stared • yesterday at 6:31 PM

One of the authors here.

The tasks here are entry level. So we are impressed that some AI models are able to detect some patterns, while looking just at binary code. We didn't take it for granted.

For example, only a few models understand Ghidra and Radare2 tooling (Opus 4.5 and 4.6, Gemini 3 Pro, GLM 5) https://quesma.com/benchmarks/binaryaudit/#models-tooling

We consider it a starting point for AI agents being able to work with binaries. Other people have found the same; see https://x.com/ccccjjjjeeee/status/2021160492039811300 and https://news.ycombinator.com/item?id=46846101.

There is a long way ahead from "OMG, AI can do that!" to an end-to-end solution.

akiselev • yesterday at 5:38 PM

When I was developing my ghidra-cli tool for LLMs to use, I was using crackmes as tests and it had no problem getting through obfuscation as long as it was prompted about it. In practice when reverse engineering real software it can sometimes spin in circles for a while until it finally notices that it's dealing with obfuscated code, but as long as you update your CLAUDE.md/whatever with its findings, it generally moves smoothly from then on.

achille • yesterday at 6:44 PM

In the article they explicitly said they stripped symbols. If you look at the actual backdoors, many are already minimal and quite obfuscated. See:

- https://github.com/QuesmaOrg/BinaryAudit/blob/main/tasks/dns...

- https://github.com/QuesmaOrg/BinaryAudit/blob/main/tasks/dro...

hereme888 • yesterday at 7:00 PM

I've used Opus 4.5 and 4.6 to RE obfuscated malicious code with my own Ghidra plugin for Claude Code and it fully reverse engineered it. Granted, I'm talking about software cracks, not state-level backdoors.

halflife • yesterday at 5:54 PM

Aren’t LLMs supposed to be better than heuristics at analyzing obfuscated code? Shouldn’t their ability to pattern match let them deduce what the obfuscated code does?

Retr0id • yesterday at 6:45 PM

Stripping symbols is fairly normal, but hiding imports ought to be suspicious in its own right.
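A rough sketch of that "suspicious in its own right" idea, assuming you already have the list of imported symbol names from some other tool. The threshold and the function name are made up for illustration. Only the resolver names (`dlopen`/`dlsym` on POSIX, `LoadLibraryA`/`LoadLibraryW`/`GetProcAddress` on Windows) are real APIs.

```python
# Functions commonly used to resolve other functions at runtime.
RESOLVERS = {"dlopen", "dlsym", "LoadLibraryA", "LoadLibraryW", "GetProcAddress"}


def imports_look_hidden(imports: list[str], min_expected: int = 10) -> bool:
    """Flag a binary whose import table is tiny but includes a runtime resolver.

    `min_expected` is an arbitrary illustrative threshold, not a tuned value.
    """
    names = set(imports)
    uses_resolver = bool(names & RESOLVERS)
    return uses_resolver and len(names) < min_expected


# Hypothetical import lists for two binaries.
normal = ["socket", "bind", "listen", "accept", "read", "write",
          "close", "malloc", "free", "printf", "exit"]
stripped_down = ["dlopen", "dlsym", "exit"]

print(imports_look_hidden(normal))         # False
print(imports_look_hidden(stripped_down))  # True
```

This is a weak signal on its own: packers and some legitimate plugin loaders look the same, so it would only be one input among several.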
