LLMs should be trained on and directly output binary.
They should not. Abstraction in software engineering brings intelligence (compression correlates with intelligence).
Generative algorithms have been studied for decades now, and while they have led to some interesting results, they're a bad fit for LLMs because there's no such thing as a "plausible" binary: a small perturbation yields an unusable result.
Technically they are, just a subset. But still a practical one: they're frequently used to produce executable files.
I think you forgot the "/s"
On the off chance that you’re serious, that would result in disastrously bad output. The difference between “jmp $+15” and “jmp $+16” is inscrutable and the LLM would not be able to pick the right one without tooling.
That tooling is a compiler. The higher the level of the language, the better the chance that the LLM can be steered to good output. Machine code is hopeless, don’t bother.
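To make the jmp example concrete, here is a rough sketch in Python. It assumes the x86 short-jump encoding (opcode 0xEB followed by a signed 8-bit displacement counted from the end of the 2-byte instruction) and the NASM meaning of `$` as the address of the current instruction. The helper name `encode_short_jmp` is made up for illustration.

```python
import struct


def encode_short_jmp(offset_from_insn_start: int) -> bytes:
    """Encode `jmp short $+N` (assumed: x86, NASM-style `$`).

    The displacement is relative to the end of the instruction,
    and a short jmp is 2 bytes long, so rel8 = N - 2.
    """
    rel8 = offset_from_insn_start - 2
    if not -128 <= rel8 <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB]) + struct.pack("b", rel8)


a = encode_short_jmp(15)
b = encode_short_jmp(16)

# Count how many bits differ between the two encodings.
differing_bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))

print(a.hex(), b.hex(), differing_bits)
```

Under those assumptions the two instructions differ only in the final byte, and nothing in the bytes themselves says which target is correct. That depends on the layout of the surrounding code, which is exactly what an assembler or compiler computes for you.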