The fetishism of "byte count" (here, as "732 byte python script") needs to stop, especially in a context like this, where they're trying to illustrate a real failure mode.
Looking at their source code [1] it starts with this simple line:
import os as g,zlib,socket as s
And already I'm perplexed. "os as g"? But we're not aliasing "zlib as z"? Clearly this was auto-generated by some kind of minimizer, likely because zlib is called only once and os multiple times. As a code author/reviewer, I would never write "os as g", and I would absolutely never approve a review of any code that used it.
Anyway, I could go on. :) Let's just stop fetishizing byte count.
[1] https://github.com/theori-io/copy-fail-CVE-2026-31431/blob/m...
I don't see it as fetishizing byte count. I think of it as a proxy measure for how complicated or uncomplicated the exploit might be. They could just as well have said "we can do it in 3 lines of python" or "the Shannon entropy of the script implementing the exploit is really small" and I would have interpreted it similarly.
Where do you see this "fetishizing" happening most often? It's a strange thing to counter-fetishize about.
I don't get the 732-byte thing either and while I think it's a relatively punchy and unusually informative landing page for named vulnerability there are little snags like this all over it.
But the fact that it's not a kernel-exec LPE and it's reliable across kernels and distributions is important; it's close to the maximum "exploitability" you're going to see with an LPE. Which the page does communicate effectively; it just gilds the lily.
llms love that though
"The honest solution: a clean 50-line cut" and so on, ad nauseam
While I agree that it doesn't make much sense to use a minimizer on code the reader could understand, the code-golfed byte count of a CVE repro communicates its complexity in a certain visceral way.
I started to take the exploit script apart and reformat it into something readable. At about 1041 bytes it's actually readable. The heart of it includes a hex-encoded, zlib-compressed blob that's 180 bytes long ('78daab77...'). This is decompressed, via zlib.decompress(d(BLOB)), into a 160-byte ELF header.
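For anyone curious about the pattern, here's a minimal sketch of that decode step, using a stand-in payload rather than the exploit's actual 180-character blob or 160-byte ELF header:

```python
import zlib

d = bytes.fromhex  # the script's one-letter hex-decode helper

# Stand-in payload; the real blob decompresses to a 160-byte ELF header.
payload = b"\x7fELF" + bytes(12)

# Level-9 zlib streams start with bytes 0x78 0xda, matching the
# '78daab77...' prefix quoted above.
blob = zlib.compress(payload, 9).hex()
assert blob.startswith("78da")

recovered = zlib.decompress(d(blob))
assert recovered == payload
```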
> I would absolutely never approve review of any code that used this.
How often do you review, and subsequently block the release of, PoCs in this sort of context? Sounds like you've faced this a lot.
I always thought code quality mattered less in those, as long as you communicate the intent.
This is pretty legible compared to the 90s C rootshell.org exploits.
It's just lazy AI* writing w/o editing.
"Just" is doing a lot of work there, I'm so annoyed reading it.
It's like an anti-ad and they had pretty cool material to work with.
* Claude loves staccato "Some numeric figure. Something else. Intensifier" (e.g. the "exploitable for a decade." or whatever sentences)
> Anyway, I could go on.
Then go on. zlib is only used once, so "zlib as z" in exchange for using z once doesn't get you anything. Using os directly and not renaming it g saves you 2 bytes though. But in this age where AI outputs reams of code at the drop of a hat, why shouldn't we enjoy how small you can get it to pop a root shell?
https://gist.github.com/fragmede/4fb38fb822359b8f5914127c2fe...
edit: If we drop offset_src=0 and just pass in 0 positionally, it comes down to 720.
>As a code author/reviewer, I would never write "os as g" and I would absolutely never approve review of any code that used this.
lucky for them, it's an exploit script, not enterprise code.
all that needs to be "reviewed" is whether or not it exploits the thing it's supposed to.
edit: y'all really think a 10-line proof of concept script needs to undergo a code review? wild.
Hilariously, "os as g" adds one more byte than it saves, since os is only used 4 times but the alias takes 5 extra bytes to save 4. And "socket as s" comes out even.
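That tally is easy to sanity-check mechanically. A toy sketch, taking the usage counts from the comment above (os used 4 times, socket once) as given:

```python
# Aliasing a module costs len(" as x") extra bytes in the import line and
# saves (len(name) - len(alias)) bytes at each call site.
def alias_delta(name: str, uses: int, alias: str = "g") -> int:
    cost = len(f" as {alias}")                # extra bytes in the import
    saved = (len(name) - len(alias)) * uses   # bytes saved per use, total
    return cost - saved                        # positive = alias is a net loss

assert alias_delta("os", uses=4) == 1         # "os as g" nets +1 byte
assert alias_delta("socket", uses=1) == 0     # "socket as s" comes out even
```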
If you wanted real savings, you'd use "d=bytes.fromhex" instead of defining a function -- 17 bytes!! And d('00') -> b'\0' for -2 bytes.
We could easily get the byte count down further by using base64.b85decode instead of bytes.fromhex (-70 or so), but ultimately we're optimizing a meaningless metric, as you mention.
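The rough arithmetic behind that "-70 or so": hex spends 2 characters per byte, while base85 spends about 5 characters per 4 bytes. Assuming the blob is around 90 raw bytes (i.e. the 180-character hex string mentioned upthread):

```python
import base64

raw = bytes(90)                          # stand-in for a ~90-byte compressed blob
hex_len = len(raw.hex())                 # 2 chars per byte
b85_len = len(base64.b85encode(raw))     # ~5 chars per 4 bytes

assert hex_len == 180
assert hex_len - b85_len == 67           # close to the "-70 or so" above
```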