jeffmcjunkin • yesterday at 9:57 PM

Can confirm. Matching decompilation in particular (where you match the compiler along with your guess at source, compile, then compare assembly, repeating if it doesn't match) is very token-intensive, but it's now very viable: https://news.ycombinator.com/item?id=46080498
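For anyone who hasn't seen the loop described above, here is a rough Python sketch of it. Everything in it is hypothetical: `propose` stands in for the LLM (it receives the previous assembly diff and returns a new source guess, or None to give up), and `compile_to_asm` stands in for whatever pinned compiler and flags you believe produced the original binary. The normalization is deliberately naive (whitespace only); real matching projects also have to deal with labels, symbol names, and relocations.

```python
from __future__ import annotations

import difflib
from typing import Callable, Optional


def normalize(asm: str) -> list[str]:
    """Collapse whitespace and drop blank lines so cosmetic differences don't count."""
    return [" ".join(line.split()) for line in asm.splitlines() if line.strip()]


def match_decompile(
    target_asm: str,
    propose: Callable[[list[str]], Optional[str]],  # hypothetical: the LLM's guess
    compile_to_asm: Callable[[str], str],           # hypothetical: pinned compiler
    max_attempts: int = 10,
) -> Optional[str]:
    """Guess source, compile, compare assembly, repeat until it matches."""
    target = normalize(target_asm)
    diff: list[str] = []  # empty on the first attempt: nothing to react to yet
    for _ in range(max_attempts):
        source = propose(diff)
        if source is None:
            return None  # the proposer gave up
        produced = normalize(compile_to_asm(source))
        if produced == target:
            return source  # matching source found
        # Feed the mismatch back so the next guess can target it.
        diff = list(difflib.unified_diff(target, produced, lineterm=""))
    return None  # attempt budget exhausted
```

Each pass through the loop is one compile plus one model call carrying the diff, which is where the token cost comes from. The `max_attempts` cap is just one simple way to bound it.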

Of course LLMs see a lot more source-assembly pairs than even skilled reverse engineers, so this makes sense. Any area where you can get unlimited training data is one where we expect to see top-tier performance from LLMs.

(also, hi Thomas!)

Replies

stackghost • yesterday at 10:47 PM

My own experience has been that "ghidra -> ask LLM to reason about ghidra decompilation" is very effective on all but the most highly obfuscated binaries.

Burning tokens by asking the LLM to compile, disassemble, compare assembly, recompile, repeat seems very wasteful and inefficient to me.

echelon • today at 2:22 AM

Has anyone used an LLM to deobfuscate compiled JavaScript?
