Hacker News

Progressive encoding and decoding of 'repeated' protobuffer fields

15 points by quarkz02, last Sunday at 12:46 PM | 2 comments

Comments

pmarreck, today at 12:56 AM

I recently came up with a novel arbitrary-length integer binary encoding called BLIP (binary large integer prefix), which in turn begat a novel binary container format (which, of course, can be arbitrarily large):

https://github.com/pmarreck/BLIP

Anyway, a protobuf-relevant excerpt from the README:

    "BLIP is 4x faster than LEB128/Protobuf for large value 
    encoding (1.3 vs 5.1 ns/op) because it writes raw LE 
    bytes with a single memcpy instead of shifting and 
    masking 7 bits at a time."
Among other (I think) interesting features, there's a reserved bit for endianness (present only where it makes sense, i.e. when the value is multibyte), which could potentially solve that whole problem (also documented in the README).
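
For readers who haven't looked at varints, here is a minimal Python sketch of the contrast the README excerpt draws. `leb128_encode` is standard unsigned LEB128 as used by Protobuf varints. `raw_le_encode` is a hypothetical one-byte-length-plus-raw-little-endian layout made up for illustration; it is an assumption and not BLIP's actual wire format (see the README for that), and it says nothing about the quoted timings.

```python
def leb128_encode(n: int) -> bytes:
    """Unsigned LEB128: 7 payload bits per byte, high bit means 'more follows'."""
    if n < 0:
        raise ValueError("unsigned values only")
    out = bytearray()
    while True:
        byte = n & 0x7F          # mask off the low 7 bits
        n >>= 7                  # shift them away
        if n:
            out.append(byte | 0x80)  # continuation bit set
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def leb128_decode(data: bytes) -> int:
    """Inverse of leb128_encode; reads until a byte without the high bit."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result
        shift += 7
    raise ValueError("truncated varint")


def raw_le_encode(n: int) -> bytes:
    """HYPOTHETICAL layout (not BLIP): 1 length byte, then raw LE payload."""
    if n < 0:
        raise ValueError("unsigned values only")
    size = max(1, (n.bit_length() + 7) // 8)
    if size > 255:
        raise ValueError("value too large for a 1-byte length prefix")
    # The payload is copied as-is; no per-byte shifting or masking.
    return bytes([size]) + n.to_bytes(size, "little")


def raw_le_decode(data: bytes) -> int:
    """Inverse of raw_le_encode."""
    size = data[0]
    return int.from_bytes(data[1:1 + size], "little")


print(leb128_encode(300))   # b'\xac\x02'
print(raw_le_encode(300))   # b'\x02,\x01'  (0x02 length, then 0x2c 0x01)
```

In C or Zig the payload copy in the second scheme would be a single `memcpy`, whereas the LEB128 loop touches every byte individually; `int.to_bytes` stands in for that copy here.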

I'm not really advertising it so much as wondering whether it's actually useful to others, since I don't normally operate at this level (I'm an Elixir dev by trade, although I seem to be drifting towards much lower levels... Zig).

fsaintjacques, yesterday at 10:40 PM

I recently toyed with a tiny VM for protobuf. You can use it in Redpanda WASM streams or in a streaming pipeline to filter and project before deserializing, without having to use SAX parsers or IDL shenanigans with a compiled projected message. 100% written by Claude.

https://github.com/fsaintjacques/wql
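
To make "filtering and projecting before deserializing" concrete, here is a rough Python sketch that walks the protobuf wire format and copies through only the wanted top-level fields. It is not how wql works (that project is a VM, and its internals are not described here); `scan_fields` and `project` are hypothetical names, and the sketch ignores nested messages and the deprecated group wire types.

```python
from typing import Iterator, Set, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7


def scan_fields(buf: bytes) -> Iterator[Tuple[int, int, bytes, bytes]]:
    """Yield (field_number, wire_type, payload, raw_record) per top-level field.

    Nothing is deserialized into a message object; values stay as raw bytes.
    """
    pos = 0
    while pos < len(buf):
        record_start = pos
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 0x07
        start = pos
        if wire == 0:                      # varint
            _, pos = read_varint(buf, pos)
        elif wire == 1:                    # fixed 64-bit
            pos += 8
        elif wire == 2:                    # length-delimited
            length, start = read_varint(buf, pos)
            pos = start + length
        elif wire == 5:                    # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire}")
        if pos > len(buf):
            raise ValueError("truncated field")
        yield field, wire, buf[start:pos], buf[record_start:pos]


def project(buf: bytes, keep: Set[int]) -> bytes:
    """Return a smaller message containing only the field numbers in keep."""
    return b"".join(raw for field, _, _, raw in scan_fields(buf) if field in keep)


# Field 1 = varint 150, field 2 = the string "testing".
msg = bytes.fromhex("089601") + bytes.fromhex("120774657374696e67")
print(project(msg, {2}).hex())  # 120774657374696e67
```

The projected bytes are still a valid protobuf message, so a downstream consumer can deserialize just the smaller payload.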