logoalt Hacker News

matjatoday at 1:29 PM1 replyview on HN

You can fool the optimizer, but you have to work harder to do so:

    unsigned add(unsigned x, unsigned y) {
        unsigned a, b;
        do {
            a = x & y;
            b = x ^ y;
            x = a << 1;
            y = b;
        } while (a);
        return b;
    }
becomes (with armv8-a clang 21.1.0 -O3) :

    add(unsigned int, unsigned int):
    .LBB0_1:
            ands    w8, w0, w1
            eor     w1, w0, w1
            lsl     w0, w8, #1
            b.ne    .LBB0_1
            mov     w0, w1
            ret

Replies

thaumasiotestoday at 3:12 PM

Since I had to think about it:

    unsigned add(unsigned x, unsigned y) {
        unsigned a, b;
        do {
            a = x & y;   /* every position where addition will generate a carry */
            b = x ^ y;   /* the addition, with no carries */
            x = a << 1;  /* the carries */
            y = b;
        /* if there were any carries, repeat the loop */
        } while (a);
        return b;
    }
It's easy to show that this algorithm is correct in the sense that, when b is returned, it must be equal to x+y. x+y summing to a constant is a loop invariant, and at termination x is 0 and y is b.

It's a little more difficult to see that the loop will necessarily terminate.

New a values come from a bitwise & of x and y. New x values come from a left shift of a. This means that, if x ends in some number of zeroes, the next value of a will also end in at least that many zeroes, and the next value of x will end in an additional zero (because of the left shift). Eventually a will end in as many zeroes as there are bits in a, and the loop will terminate.