Hacker News

Retr0id, yesterday at 8:04 PM

Putting domain separators in the IDL is interesting, but you can also avoid the problem by putting the domain separators in-band (e.g. in some kind of "type" field that is always present).
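A minimal sketch of the in-band idea, assuming a JSON payload and using HMAC-SHA256 from the standard library as a stand-in for a real signature scheme. The key, the function names, and the "TreeRoot"/"KeyRevoke" type strings are illustrative assumptions, not anything taken from the article:

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # hypothetical key, for illustration only


def sign(msg_type: str, body: dict) -> tuple[bytes, bytes]:
    """Embed the type in the payload itself, then authenticate the bytes."""
    payload = dict(body, type=msg_type)  # the "type" field is always present
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    tag = hmac.new(KEY, encoded, hashlib.sha256).digest()
    return encoded, tag


def verify(expected_type: str, encoded: bytes, tag: bytes) -> bool:
    """Check the tag first, then check that the signed type is the expected one."""
    expected_tag = hmac.new(KEY, encoded, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        return False
    return json.loads(encoded).get("type") == expected_type


encoded, tag = sign("TreeRoot", {"hash": "ab12"})
print(verify("TreeRoot", encoded, tag))
print(verify("KeyRevoke", encoded, tag))
```

Because the type travels inside the signed bytes, a validly signed TreeRoot fails verification when presented as a KeyRevoke, at the cost of sending the type with every message.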

Tangentially, depending on what your input and data model look like, canonicalisation takes O(n log n) time (i.e. the cost of sorting your fields).

Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html
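To illustrate the general idea only (this is not the construction from the linked post, and this toy additive scheme should not be assumed collision-resistant against an adversary), an order-independent hash can be built by hashing each field separately and combining the results with a commutative operation. The names `field_hash` and `multiset_hash` are made up for this sketch:

```python
import hashlib

MOD = 2**256  # per-field hashes are added modulo 2^256


def field_hash(key: str, value: str) -> int:
    """Hash one (key, value) pair; the key is length-prefixed to avoid ambiguity."""
    k = key.encode()
    v = value.encode()
    data = len(k).to_bytes(8, "big") + k + v
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def multiset_hash(fields: dict[str, str]) -> int:
    """Combine per-field hashes by addition, so iteration order is irrelevant."""
    total = 0
    for key, value in fields.items():
        total = (total + field_hash(key, value)) % MOD
    return total


a = multiset_hash({"name": "alice", "role": "admin"})
b = multiset_hash({"role": "admin", "name": "alice"})
print(a == b)
```

Addition is commutative, so the fields can be processed in a single pass in whatever order they arrive, with no sort.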


Replies

majormajor, yesterday at 8:18 PM

I think a lot of people assume that the "name" of the type, for protos, will be preserved somewhere in the output, such that a TreeRoot couldn't be re-used as a KeyRevoke. It makes sense that it isn't (you generally don't want to send that name every time), but it's non-obvious to people with an object-oriented-language background who just think "ah, different types are obviously different types." Serialization cost is also the objection I've most often seen raised against in-band type fields and the like, so having a unique identifier that gets used just for signature computation is clever.
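A rough sketch of both halves of that point, hand-encoding a single length-delimited field in the protobuf wire format rather than using a protobuf library. The field layouts of TreeRoot and KeyRevoke, the 8-byte separator values, and the HMAC stand-in for a real signature are all assumptions made for illustration:

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key, for illustration only

# Hypothetical fixed-length domain separators, one per message type.
TREE_ROOT_DOMAIN = bytes.fromhex("a1b2c3d4e5f60718")
KEY_REVOKE_DOMAIN = bytes.fromhex("18f7e6d5c4b3a290")


def encode_bytes_field(field_number: int, value: bytes) -> bytes:
    """Encode one length-delimited field: tag byte, length byte, then the bytes.

    Only handles field numbers below 16 and values under 128 bytes, so that
    the tag and the length each fit in a single-byte varint.
    """
    assert field_number < 16 and len(value) < 128
    return bytes([(field_number << 3) | 2, len(value)]) + value


# Assumed shapes: TreeRoot { bytes hash = 1; } and KeyRevoke { bytes key_id = 1; }
tree_root = encode_bytes_field(1, b"\x11" * 32)
key_revoke = encode_bytes_field(1, b"\x11" * 32)
print(tree_root == key_revoke)  # the type name appears nowhere on the wire


def sign(domain: bytes, encoded: bytes) -> bytes:
    """Mix the separator into the signature input only; it is never transmitted."""
    return hmac.new(KEY, domain + encoded, hashlib.sha256).digest()


print(sign(TREE_ROOT_DOMAIN, tree_root) == sign(KEY_REVOKE_DOMAIN, key_revoke))
```

The two messages are byte-for-byte identical on the wire, yet their signatures differ because each type contributes its own separator to the signed input. The separators are fixed-length here so that the boundary between separator and message is unambiguous.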

What's possibly over my head, from skimming your multiset hashing post, is how it avoids the "these payloads have the same shape, so one could be re-sent as the other" issue. It seems like a solution to a different problem?
