Hacker News

stevefan1999 01/15/2026 3 replies

I would really love to know how to parse context-sensitive stuff like typedef, which "switches" the syntax for some tokens. I would also like to know about things like "hoisting" in C++, where you can use a class or struct declared after the code inside the function too, but I just find it hard to describe them in a rigorous formal language and grammar.

Hacky solutions for PEG, such as adding a context stack, require careful management of the entry/exit points, but the more fundamental problem is that you still can't "switch" syntax, or you have to add every possible syntax combination, depending on the number of such stacks. I believe persistent data structures and transactional data structures would help, but I just couldn't find a formalism for that.
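
To make the persistent-data-structure idea concrete, here is a minimal Python sketch. It is not a formalism, only an illustration under assumptions: `Env`, `choice`, and the two toy alternatives are hypothetical names invented for this example. The set of known type names is an immutable value threaded through each parse result, so a failed branch of an ordered choice cannot leak the names it declared speculatively.

```python
from typing import NamedTuple, Optional


class Env(NamedTuple):
    """Immutable scope chain of type names (hypothetical helper)."""
    names: frozenset
    parent: Optional["Env"]

    def declare(self, name):
        # Returns a NEW Env; the old one stays valid for backtracking.
        return Env(self.names | {name}, self.parent)

    def push(self):
        return Env(frozenset(), self)

    def pop(self):
        return self.parent

    def is_type(self, name):
        env = self
        while env is not None:
            if name in env.names:
                return True
            env = env.parent
        return False


def choice(*alternatives):
    """PEG-style ordered choice; every alternative starts from the same env."""
    def run(tokens, pos, env):
        for alt in alternatives:
            result = alt(tokens, pos, env)
            if result is not None:
                return result
        return None
    return run


def typedef_then_fail(tokens, pos, env):
    env = env.declare(tokens[pos])  # speculative declaration...
    return None                     # ...but this branch fails


def plain_identifier(tokens, pos, env):
    return (tokens[pos], pos + 1, env)


base = Env(frozenset(), None)
result = choice(typedef_then_fail, plain_identifier)(["T"], 0, base)
```

Because `declare` never mutates, there is no entry/exit bookkeeping to undo when the first alternative fails: the second alternative simply receives the original `base`. This does not by itself solve the "switching syntax" problem; it only removes the manual stack management.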


Replies

remexre 01/15/2026

https://en.wikipedia.org/wiki/Lexer_hack

Make your parser call back into your lexer, so it can pass state to it; make the set of type names available to it.
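
A rough Python sketch of that feedback loop, under simplifying assumptions: the toy grammar only knows `typedef <type> <name>;` and statements that start with a name, and `Lexer`, `classify_statements`, and `BUILTIN_TYPES` are hypothetical names made up for this example. The key point is that the lexer produces tokens on demand, so a typedef seen by the parser changes how later identifiers are classified.

```python
import re

IDENT_RE = re.compile(r"[A-Za-z_]\w*")
BUILTIN_TYPES = {"int", "char", "float"}


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.type_names = set()  # filled in by the parser as it goes

    def next_token(self):
        text = self.text
        while self.pos < len(text) and text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(text):
            return ("EOF", "")
        match = IDENT_RE.match(text, self.pos)
        if match:
            self.pos = match.end()
            word = match.group()
            if word == "typedef":
                return ("TYPEDEF", word)
            # The "hack": the same spelling lexes differently
            # depending on what the parser has declared so far.
            if word in BUILTIN_TYPES or word in self.type_names:
                return ("TYPE_NAME", word)
            return ("IDENT", word)
        self.pos += 1
        return ("PUNCT", text[self.pos - 1])


def classify_statements(source):
    lexer = Lexer(source)
    results = []
    while True:
        kind, text = lexer.next_token()
        if kind == "EOF":
            return results
        if kind == "TYPEDEF":
            lexer.next_token()                 # underlying type
            _, alias = lexer.next_token()      # newly introduced name
            lexer.type_names.add(alias)        # parser -> lexer feedback
            results.append("typedef " + alias)
        elif kind == "TYPE_NAME":
            results.append("declaration")      # e.g. `T * x;` declares a pointer
        else:
            results.append("expression")       # e.g. `T * x;` is a multiplication
        while kind != "EOF" and text != ";":   # skip to end of statement
            kind, text = lexer.next_token()


print(classify_statements("T * x; typedef int T; T * x;"))
# → ['expression', 'typedef T', 'declaration']
```

The same source text `T * x;` is classified two different ways, before and after the typedef. Note that this only works because the lexer never tokenizes ahead of the parser.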

luksenburg 01/15/2026

Another possible solution is to use functional parser combinators (e.g. [0]) together with some form of ‘do’ notation. Each step makes its result available to all subsequent parsers.

[0] https://hackage.haskell.org/package/parsec
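
To illustrate the idea without Haskell, here is a small Python sketch that imitates ‘do’ notation with a generator. This is not Parsec's API: `regex`, `keyword`, `symbol`, `do`, and `typedef_then_declaration` are hypothetical helpers written for this example, and the sketch has no backtracking or choice operator. Each `yield` runs one parser and hands its result back, so a later step can be built from an earlier result.

```python
import re


def regex(pattern):
    """Parser: skip whitespace, then match `pattern`; returns (value, new_pos) or None."""
    compiled = re.compile(r"\s*(" + pattern + ")")

    def run(text, pos):
        match = compiled.match(text, pos)
        return None if match is None else (match.group(1), match.end())
    return run


def keyword(word):
    return regex(re.escape(word) + r"\b")


def symbol(sym):
    return regex(re.escape(sym))


word = regex(r"[A-Za-z_]\w*")


def do(generator_fn):
    """Turn a generator that yields parsers into a single sequential parser."""
    def run(text, pos):
        gen = generator_fn()
        try:
            parser = next(gen)
            while True:
                result = parser(text, pos)
                if result is None:
                    return None          # any failing step fails the whole sequence
                value, pos = result
                parser = gen.send(value)  # feed the result to the next step
        except StopIteration as stop:
            return (stop.value, pos)
    return run


@do
def typedef_then_declaration():
    yield keyword("typedef")
    base = yield word
    alias = yield word
    yield symbol(";")
    yield keyword(alias)   # this parser is constructed from an earlier result
    yield symbol("*")
    var = yield word
    yield symbol(";")
    return {"alias": alias, "base": base, "pointer_var": var}


ok = typedef_then_declaration("typedef int T; T * x;", 0)
bad = typedef_then_declaration("typedef int T; U * x;", 0)
```

Here `ok` carries the parsed alias, base type, and variable name, while `bad` is `None` because `U` was never introduced by the typedef. The context sensitivity lives in ordinary variable scope rather than in the grammar.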

torginus 01/15/2026

C/C++ has one of the worst-designed syntaxes; it's such a shame that entire families of the most popular languages ended up copying the same mistakes.

I know it's no solace to you, but Rust and Go don't even have this problem AFAIK, and it's avoidable with careful consideration.
