
Neywiny, yesterday at 3:38 PM

That's an incredible find, and once I saw the assembly I was right along with them on the debug path. Interestingly, it doesn't need to be assembly for this to work; that's just where the split happened. The IR could have done it, but it doesn't, for very good reasons. So another win for being able to read ARM assembly.

Unsure if this would be another way to do it, but to save an instruction at the cost of a memory access, maybe you could push and then pop the stack size? Presumably you're already doing that pair of moves on function entry and exit. I'm not really sure what the garbage collector is looking for, so maybe that doesn't work, but I'd be interested to hear some takes on it.


Replies

Veserv, yesterday at 5:29 PM

You would normally use the “LDR Rd, =expr” pseudo-instruction form [1]. For immediates that are not directly constructible, it puts a copy of the immediate value in a PC-relative memory location, then does a PC-relative load into the register.

So the whole “add constant to SP” sequence becomes 2 executable instructions (1 to construct the immediate and 1 to add, 8 bytes in total) plus a 4-byte data area for the 17-bit immediate. That is 12 bytes of binary, the same size as 3 executable instructions.

[1] https://developer.arm.com/documentation/dui0801/l/A64-Data-T...
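
To make the byte count above concrete, here is a minimal Python sketch. It is not taken from the article or from any real assembler. It assumes two standard A64 facts: every instruction is 4 bytes wide, and ADD (immediate) takes a 12-bit value optionally shifted left by 12. The function names and the example constant 0x10010 are hypothetical.

```python
INSN_BYTES = 4  # every A64 instruction is 4 bytes wide


def fits_add_immediate(value: int) -> bool:
    """True if value fits one A64 ADD (immediate): imm12, optionally LSL #12."""
    if 0 <= value < (1 << 12):
        return True
    return 0 <= value < (1 << 24) and (value & 0xFFF) == 0


def literal_load_cost(literal_bytes: int = 4) -> dict:
    """Size of an 'LDR Rd, =expr' load followed by a register-form ADD."""
    executable = 2 * INSN_BYTES  # one PC-relative load + one add
    total = executable + literal_bytes
    return {
        "executable": executable,
        "data": literal_bytes,
        "total": total,
        "instruction_slots": total // INSN_BYTES,
    }


frame_size = 0x10010  # hypothetical constant that needs 17 bits
print(frame_size.bit_length(), fits_add_immediate(frame_size))
print(literal_load_cost())
```

The sketch only does the size arithmetic. It says nothing about whether a literal pool is practical in the compiler under discussion.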

pklausler, yesterday at 5:40 PM

I'm a little surprised that this bug wasn't fixed in the assembler as a special case for immediate adds to RSP. If the patch was to the compiler only, other instances of the bug could be lurking out there in aarch64 assembly code.

bloak, yesterday at 3:49 PM

> So another win for being able to read arm assembly.

Yes, though that weird stuff with dollars in it is not normal AArch64 assembly!

The article could have mentioned the "stack moves once" rule.

pjmlp, yesterday at 4:14 PM

Usually in runtimes like Java and .NET there are safepoints exactly to avoid changing context in the middle of a set of instructions.

titzer, yesterday at 3:48 PM

I think the right fix is for the compiler to, e.g., load the constant into a register using two moves and then emit a single add. It's one more instruction, but then the adjustment is atomic (i.e. a single instruction). Another option is to do the arithmetic in a temp register and then move it back.
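
The difference between a split adjustment and a single add can be sketched with a toy Python model that records SP at every instruction boundary. This is an illustration, not the compiler's actual code. The MOVZ/MOVK/ADD mnemonics in the comments are standard A64, but the order of the two split adds, the function names, and the example values are all assumptions.

```python
def split_immediate_adds(sp: int, imm: int):
    """Yield SP after each instruction of a two-ADD sequence."""
    assert 0 <= imm < (1 << 24)
    sp += imm & 0xFFF        # ADD SP, SP, #lo12
    yield sp
    sp += imm & 0xFFF000     # ADD SP, SP, #hi12, LSL #12
    yield sp


def moves_then_single_add(sp: int, imm: int):
    """Yield SP after each instruction when the constant is built in a temp."""
    assert 0 <= imm < (1 << 32)
    tmp = imm & 0xFFFF           # MOVZ tmp, #lo16
    yield sp
    tmp |= imm & 0xFFFF0000      # MOVK tmp, #hi16, LSL #16
    yield sp
    sp += tmp                    # ADD SP, SP, tmp
    yield sp


def torn_states(trace, old_sp: int, new_sp: int) -> list:
    """SP values at instruction boundaries that are neither old nor new."""
    return [value for value in trace if value not in (old_sp, new_sp)]


old_sp, imm = 0x4000_0000, 0x10010  # hypothetical stack pointer and frame size
new_sp = old_sp + imm
print(torn_states(split_immediate_adds(old_sp, imm), old_sp, new_sp))
print(torn_states(moves_then_single_add(old_sp, imm), old_sp, new_sp))
```

In this model, anything that inspects SP between instructions can see a half-adjusted value with the split sequence. With the register-based sequence it only ever sees the old or the new SP.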