Hacker News

gritzko 11/08/2024, 3 replies

I have experience writing parsers (lexers) in Ragel, using Go, Java, C++, and C. I must say, once you have some boilerplate generator in place, raw C is as good as the Rust code the author describes. Maybe even better, because of its simplicity. For example, this is most of the code necessary to have a JSON parser: https://github.com/gritzko/librdx/blob/master/JSON.lex

In fact, that eBNF only produces the lexer. The parser part is not that impressive either: 120 LoC and quite repetitive. https://github.com/gritzko/librdx/blob/master/JSON.c

So, I believe, a parser infrastructure evolves till it only needs eBNF to make a parser. That is the saturation point.


Replies

dvdkon 11/08/2024

That repetitiveness can be seen as a downside, not a virtue. And I feel that Rust's ADTs make working with the resulting syntax tree much easier.

Though I agree that a little code generation and/or macro magic can make C significantly more workable.

djoldman 11/08/2024

I love love love ragel.

Won't the code here:

https://github.com/gritzko/librdx/blob/master/JSON.lex

accept "[" as valid json?

   delimiter = OpenObject | CloseObject | OpenArray | CloseArray | Comma | Colon;
   primitive = Number | String | Literal;
   JSON = ws* ( primitive? ( ws* delimiter ws* primitive? )* ) ws*;
   Root = JSON;
(pick zero of everything in JSON except one delimiter...)
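
To make that concrete, here is a rough Python sketch that transcribes the `JSON` rule above into a regular expression. The token patterns (`NUMBER`, `STRING`, `LITERAL`) are simplified stand-ins I made up for illustration; the real definitions are in JSON.lex and may differ. The function name `lexes` is also hypothetical.

```python
import re

# Simplified stand-ins for the token rules (assumptions, not copied from JSON.lex).
WS = r"[ \t\r\n]"
NUMBER = r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
STRING = r'"(?:[^"\\]|\\.)*"'
LITERAL = r"(?:true|false|null)"
DELIMITER = r"[{}\[\],:]"
PRIMITIVE = rf"(?:{NUMBER}|{STRING}|{LITERAL})"

# JSON = ws* ( primitive? ( ws* delimiter ws* primitive? )* ) ws*;
JSON_TOKENS = re.compile(
    rf"{WS}*(?:{PRIMITIVE})?(?:{WS}*{DELIMITER}{WS}*(?:{PRIMITIVE})?)*{WS}*"
)


def lexes(text: str) -> bool:
    """Return True if the whole input matches the token-level rule."""
    return JSON_TOKENS.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["[", "]]", '{"a": [1, 2]}', "@"]:
        print(repr(sample), lexes(sample))
```

Under these assumptions the rule matches a lone "[" (and also "]]"), because it only describes a sequence of tokens and says nothing about how the delimiters nest.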

I usually begin with the RFCs:

https://datatracker.ietf.org/doc/html/rfc4627#autoid-3

I'm not sure one can implement JSON with ragel... I believe ragel can only handle regular languages, and JSON is context-free.
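
A small Python sketch of what the regular/context-free gap means in practice: checking that brackets nest correctly needs a stack (or an equivalent counter per bracket kind), which a plain regular expression does not have. This is only an illustration of the nesting check, not a JSON validator and not how librdx does it; `brackets_balanced` is a hypothetical name.

```python
def brackets_balanced(text: str) -> bool:
    """Check that {} and [] nest correctly, ignoring brackets inside strings."""
    pairs = {"}": "{", "]": "["}
    stack = []          # open brackets seen so far, innermost last
    in_string = False   # True while between unescaped double quotes
    escaped = False     # True right after a backslash inside a string
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack and not in_string


if __name__ == "__main__":
    for sample in ["[", '[[1, 2], {"a": []}]', '["]"]', "[}"]:
        print(repr(sample), brackets_balanced(sample))
```

The stack is the part that has to live outside the token-level grammar, which fits the split described at the top of the thread: the grammar produces the lexer, and a separate parser handles structure.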

pornel 11/09/2024

A C/Ragel bug was the cause of Cloudbleed, and the reason why Cloudflare switched to Rust.

https://en.wikipedia.org/wiki/Cloudbleed
