Lichess uses a scheme which is probably more efficient on average, described on revoof's blog[0]. Basically, it's a variable length scheme where the first 64 bits encode square occupancies, followed by piece codes (including castling, side to move, and ep with some trickery), followed by half-move clocks if necessary.
0: https://lichess.org/@/revoof/blog/adapting-nnue-pytorchs-bin...
I remember asking myself this question years ago, and came to 162 bits. I was just a kid back then so the logic is probably wrong but I do wonder how simple the encoding could be under those constraints...
Edit: Here are the Notes
0 Empty
10 Pawn
1100 Knight
1101 Rook
1110 Bishop
1111 Queen
32 + 32 + 472
2 times 6 bits: position of the kings
30 bits: color mask
120 + 2*6 + 30 = 162 bits
We can store the rest using the methods from the blog post and add 18 bits for promotion, giving 180 bits.
I'm sure this isn't the most efficient way, and I think I had other methods and considered things like the bishops being able to occupy 32 squares, though special casing doesn't make sense because of promotions.
Technically if you got 8 bishops/queens/knights/rooks You would occupy another 16 bits, giving 196 bits
I think the upper limit can be reduced at the cost of increasing the lower limit
EDIT2: I think I made the assumption at the time that to promote one piece you needed to capture at least one enemy pawn, giving the space for the two bits, which means the upper bound is actually 180 bits
Would love to see other people try in the comment section
26 bytes is 208 bits, about twice what you really need for a minimal encoding that has enough context (en passant, castling) to generate an accurate set of legal moves. I wrote a chess database tool back in the 90's (CDB) that used 96-bit encodings (if memory serves) to index all the positions reached in a collection of games so that one could see the moves made from any position, their frequencies, and their game outcomes. Good fun.
To extend this to entire game histories, here's the Lichess blog post, "Compressing Chess Moves Even Further, To 3.7 Bits Per Move": https://lichess.org/@/marcusbuffett/blog/compressing-chess-m...
This nerd-sniped me. I think we should separate "chess board" and "chess game state" as two different problems. For "chess board", we don't consider any castling, en-passant or similar. We might store a bit for "who goes next". This is useful for chess puzzles where state shenanigans are rare (I think).
But if we store castling state, I think we are already trying to store the whole game state, and this is representable only by full history because of move repeating rules.
So, I think storing board state as a sequence of moves is more interesting. I would estimate that a number of possible actions on average is closer to 8 than to 16, so it would give as 3 bits for half-move and 6 bits for full move. 24-move game could be represented with 18 bytes, which is considerably lower than 26 bytes!
You can get close to average bit per move, if you reuse "spare" places for the next move. So, for instance, first move have 20 possibilities, which is representable by 5 bits, but you can reuse "spare" 32-20=12 possibilities as a bit for the next move.
This is a representation assuming you use only "move validator" thing that returns a list of possible moves. I think that if you use a chess engine that would output you a probability distribution of possible moves, you can compress noticeably better on average, but decoding would be slow.
Previously discussed (2022):
> Since each position takes up 6 bits ($2^6 = 64$), multiplying 6 bits by 32 pieces gives us 192 bits / 24 bytes (1 byte = 8 bits).
But each position can only be used once, so you really only have 64*63*62*61*...*33 possibilities = 64!/32! = ~2^53, so you could encode this with less than 7 bytes, and then use the basic 12-byte encoding for captures, castling, en-passant, and promotions, and you are below 19 bytes in total. (Did I miscalculate this?)
Also pawns can never be on the 1st or 8th rank so you can subtract those possibilities from each of the numbers that represents a pawn.
You also don't care about the order of bishops, rooks, knights, and especially pawns - so you should be able to do even better.
This is fun. Of course this problem is also a fun way to consider an upper bound on the total number of board states and therefore how hard it is to 'solve' chess compared to a game like checkers. Hitting the calculator 26 bytes works out to chess being no more than 4.113761393×10⁶² possible states. I'll start my GPU solving that right now!
[edit] This made me look for articles estimating this and I found this one [1] which confirms the above is in the right ballpark. Actual study (according to the article) says 4.822 x10^44 is their upper bounds
[1] https://chess-grandmaster.com/how-many-possible-chess-positi...
I have a lot more to optimize before I'm crunching down the positions but I just made a chess platform[0] with the intention of tracking your play style over many games (integrated with chess.com only for now) because the other ones I've used (including chess.com somehow) only really analyze a game at a time. It was a lot of fun to build and it's been really useful for me to identify some weaknesses and have a 'coach' to talk through them with and replay positions. I'd love feedback from any chess players! (email is in my bio)
I don't understand the castling part of this - you can move a rook from its starting square and back and castling isn't available - it says that you can determine whether castling is available from the location of the pieces?
Why not do it simpler? : Create an array with 16 elements, one element per piece, black + white. Every array element is 7 bits wide, 1 bit for captured or not, and 6 bits for the square number the piece is on (8 x 8). Then you need 16 * 7 = 112 bits = 14 bytes. (And the captured-bit can even be compressed further as a 65th square, but that makes it more calculation intensive to extract a position)
I wish these articles acknowledged that densely packed structures like that have significant overhead in terms of the instructions which must be generated to parse them. If that shit gets inlined all over the place, how much bigger is the binary now? Absolute minimalism is rarely the right choice, the size of .text matters too.
The king can't be captured, so the capture bit for the kings can be used for something else.
Bishops only need 5 bits instead of 6 (they can't move to a square of different color), shaving 2 bits and thus reaching exactly 26 bytes.
BTW 495 can be computed as a binomial coefficient C(8+5-1,5-1), the number of combinations of 8 elements chosen with repetitions from 5 elements.
I think didn't mention the bit for whose move it is? Luckily, he has a few spare bits.
what's wrong with scrolling on that blog? arrow keys don't work
I would do it like this. There are 32 pieces, any of which may be missing. So let's use four bytes (32 bits) as a mask of what pieces are present. Then, coordinates can be given for each piece that is present. The board is 8x8, so coordinates can be encoded as pairs of 3 bits, e.g. A3 is 000:011. The worst case is that we need 32 of these pairs, which requires 24 bytes. That brings us to a worst case of 28.
How can we eliminate two bytes?
We can more cleverly encode the mask so that two bytes are used to express the all-pieces-present, and certain other cases. A fully decoded mask is only used for boards that have too few pieces to break over 26 bytes.
For instance suppose we have a two-bit header encoding several cases:
00 - no piece are missing (32 six-bit coordinates follow)
01 - one piece is missing, followed by 5 bit ID of that piece
10 - two pieces are missing, followed by two 5 bit IDs of the two pieces
11 - three or more pieces missing, full mask follows.
In case 11, we need a 32 bit mask, followed by as many as 29 coordinate pairs.
So 2 + 32 + 6 * 29 = 208 bits.
And hey look, by dumb luck, 208 / 8 = 26.
I will check the article later.
There's a logic error from assuming that because the rook is in its original position that the rook has not moved. Also I'm not sure if en passant is available if the pawn has moved from its home file, even if it subsequently moved back - so you can't assume either of these just by looking at the piece's position.
I think that you need one extra bit, that can contextually encode "rook has moved" or "en passant available".
I know of two distinct methods of encoding any legal chess position into 24 bytes worst case. In both cases, you get the full position, plus who to move, plus full information on future castling and en-passant possibilities. This is the FEN state of the board, minus the two counts. It's more than the information you get from a published chess diagram in a book or magazine. Although in a book or magazine inevitably "who to move" is represented somehow, castling and en-passant possibilities are not usually.
Method 1: Lichess method; 64 bit header, 1 bit per square indicating occupied squares, then (up to) 32 4 bit codes for the 32 occupied squares. So 24 bytes worst case!
Method 2: My own method a list of 2 bit codes, one of the 4 codes indicates empty square, the other three codes are prefixes for a second 2 bit code. Three prefixes applied to one of 4 code values gives a total of 12 possibilities corresponding to any possible chess piece. Worst case 32x2 bits plus 32x4 bits = 24 bytes.
In each case there is sufficient entropy to create tricks to add the supplementary information (who to move etc.), similar tricks are in the original article.
I mention my own method from my Tarrasch Chess GUI https://github.com/billforsternz/tarrasch-chess-gui only for completeness. If I had known about method 1 I would have used that, it is simpler and better and there is much more entropy available making the tricks easier.
I would urge commentators to keep clear the difference between compressing a chess position (this challenge), a chess move and a chess game. A chess move needs far less bits of course. A complete chess game is always encoded as a list of moves, not positions, for this reason.
Edit: I should have mentioned that the chief advantage of method 1 over method 2 is average bits required. An empty board is 64x1 bits = 8 bytes for method 1 and 64x2 bits = 16 bytes for method 2.
Edit 2: I am going to describe my tricks, just because they are fun. Two kings the same colour means Black to move. Two White kings means the first king is white, two black kings means the first king is black. Otherwise White to move. A friendly pawn on the first rank means an enpassant vulnerable pawn. Swap it with the 4th rank square to get the actual contents of both squares. A hostile pawn on the first rank is an unmoved rook or king, establishing all castling rights. The castling and en-passant tricks can be combined in a pleasant and harmonious way.
In my head I count 30 bytes, since all 16 pawns can be underpromoted to a bishop rook or knight.
"The 'bit-level magic' here is misleading. They're using integer representations and calling them bits. A true bit-level approach would encode positions as pure binary streams. For example, their promotion string '00000034' is 8 bytes (64 bits), not the claimed 9 bits. Has anyone implemented this with actual bitwise operations instead of integer packing?
TLDR: You stupi...lovely folks, need learn what a bit is. What is a byte, and why integers are NOT BITS.
And no one called this post out from years ago? Crazy.
Someone ask, and I will show you how to do it in 24 BYTES IN BINARY ENCODING FOR REAL - no fake "integers are bits". And can we use compression techniques? We can get this to 10-12 bytes maybe.
So it beats the fake "26 bytes" and my version is actually real binary bits.
Ask.
Very clever, but that's the problem, clever is never the correct solution.
With a few bytes more more you can create an implementation that is a lot easier to understand. Bytes are cheap, developer time isn't.
Either we have to say that the position does not dictate the possible moves, or that this does not fully capture the position. The problem here is that drawing can become an option or a requirement based on information that this representation doesn't capture.
First the simpler version of this problem. After 50 full moves without a capture or pawn move, a draw MAY be claimed. After 75 moves, a draw MUST be claimed. This requires a count to be kept that may require up to 7 more bits.
The bigger problem is draw by repetition. If a position repeats exactly (same castling and en passant options) for a third time, then a draw MAY be claimed. If it repeats exactly for a fifth time, then a draw MUST be claimed. (Usually it is claimed on the third time, but you don't have to.) Applying this rule correctly requires not just knowing the current position, but what positions have occurred previously, and how often. Back to the last pawn move, capture, or change in potential castling status. This may require (per the first rule) knowing what up to 75 different past positions were.
The best way to store this history is almost certainly not as a list of positions, but as a history of moves. But, even if done efficiently, we will need more bytes for that history than we needed for the position.