I remember asking myself this question years ago, and came to 162 bits. I was just a kid back then so the logic is probably wrong but I do wonder how simple the encoding could be under those constraints...
Edit: Here are the Notes
0 Empty
10 Pawn
1100 Knight
1101 Rook
1110 Bishop
1111 Queen
32 + 32 + 472
2 times 6 bits: position of the kings
30 bits: color mask
120 + 2*6 + 30 = 162 bits
We can store the rest using the methods from the blog post and add 18 bits for promotion, giving 180 bits.
I'm sure this isn't the most efficient way, and I think I had other methods and considered things like the bishops being able to occupy 32 squares, though special casing doesn't make sense because of promotions.
Technically if you got 8 bishops/queens/knights/rooks You would occupy another 16 bits, giving 196 bits
I think the upper limit can be reduced at the cost of increasing the lower limit
EDIT2: I think I made the assumption at the time that to promote one piece you needed to capture at least one enemy pawn, giving the space for the two bits, which means the upper bound is actually 180 bits
Would love to see other people try in the comment section
Each pawn that wants to be promoted either takes: (a) another 'special' piece (knight/rook/bishop/queen), in which case it has already bought enough bit budget to later be promoted; or (b) another pawn, in which case this temporarily saves 1 bit (as the other pawn becomes a space), but then later we need 2 extra bits for the promotion, so we pay 1 bit extra per pawn in total
In the case of (b) there are now fewer pawns that can be promoted, and so worst case, we have to pay a budget of 1 bit per each of 8 promoted pawns.
So I think maximum required bits is only 162 + 8 = 170?
Considering at least half of all squares are empty, further compression is in order for the empty space.
Also if you're encoding the king as a position instead of a byte sequence you would have to encode their space as empty, that's an extra 2 bits