Does anyone have a good EBNF grammar for SQLite? I tried to write a tree-sitter grammar, since tree-sitter produces C code and has great Rust bindings. But SQLite uses the Lemon parser generator, and I'm not sure how to read the grammar from that.
Not EBNF or anything standard, but possibly readable enough. It is an LR(1) grammar that was tested against all the test cases in SQLite's test suite at the time:
https://lrparsing.sourceforge.net/doc/examples/lrparsing-sql...
The grammar contains things you won't have seen before, like Prio(). Think of them as macros. It all gets translated to LR(1) productions, which you can ask it to print out. LR(1) productions are simpler than EBNF. They look like:
symbol1 := symbol2 symbol3
symbol1 := symbol4 symbol3
symbol3 := token1 symbol2 token2
...
Documentation on what the macros do, and on how to get it to print the LR(1) productions, is here: https://lrparsing.sourceforge.net/doc/html/
It was used for a task similar to the one the OP is attempting.
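To illustrate how flat productions relate to an EBNF-style view, here is a minimal Python sketch that merges productions sharing a left-hand side into one rule with alternatives. It assumes the input is text in the `lhs := rhs` form shown above; the actual output format of lrparsing may differ, and `group_productions` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

def group_productions(lines):
    """Merge flat 'lhs := rhs' productions into one rule per left-hand side."""
    alternatives = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or ":=" not in line:
            continue  # skip blanks and anything that is not a production
        lhs, rhs = line.split(":=", 1)
        # Normalise whitespace inside the right-hand side.
        alternatives[lhs.strip()].append(" ".join(rhs.split()))
    # Join the alternatives of each symbol with '|', keeping first-seen order.
    return {lhs: " | ".join(alts) for lhs, alts in alternatives.items()}

# The three example productions from the comment above.
productions = [
    "symbol1 := symbol2 symbol3",
    "symbol1 := symbol4 symbol3",
    "symbol3 := token1 symbol2 token2",
]
rules = group_productions(productions)
for lhs, body in rules.items():
    print(f"{lhs} ::= {body}")
```

This only adds alternation; recovering EBNF repetition or optional groups from recursive productions would take more work.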
It looks pretty much like BNF. Not too far off, anyway. https://sqlite.org/src/doc/trunk/doc/lemon.html#syntax
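As a rough sketch of how close Lemon rules are to BNF: a Lemon rule is a BNF production with a few extras, namely symbol aliases in parentheses, a terminating period, an optional precedence tag in square brackets, and a C action in braces. The Python below strips those extras from a single rule. `lemon_rule_to_bnf` is a hypothetical helper that handles one rule at a time; it does not handle `%`-directives or comments, so treat it as a starting point rather than a converter for a full grammar file.

```python
import re

def lemon_rule_to_bnf(rule):
    """Strip Lemon-specific decoration from one rule and return plain BNF text."""
    rule = re.sub(r"\{.*\}", "", rule, flags=re.S)  # drop the C action block
    rule = re.sub(r"\[[A-Za-z_]\w*\]", "", rule)     # drop a precedence tag such as [NOT]
    rule = re.sub(r"\([A-Za-z_]\w*\)", "", rule)     # drop symbol aliases such as (A)
    rule = rule.strip().rstrip(".")                  # drop the terminating period
    lhs, rhs = rule.split("::=", 1)
    return f"{lhs.strip()} ::= {' '.join(rhs.split())}".rstrip()

# Illustrative rules written in Lemon syntax (not copied from SQLite's grammar).
plus_rule = lemon_rule_to_bnf("expr(A) ::= expr(B) PLUS expr(C). { A = B + C; }")
neg_rule = lemon_rule_to_bnf("expr ::= MINUS expr. [NOT]")
print(plus_rule)
print(neg_rule)
```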
Perhaps this ANTLR v4 sqlite grammar? [1]
--
1: https://github.com/antlr/grammars-v4/tree/master/sql/sqlite
The lemon tool that is used by SQLite can output the grammar as an SQL database that you can manipulate. There is https://github.com/ricomariani/CG-SQL-author, which goes way beyond that, though you'll need to write the Rust generation yourself. You can play with it here with a Lua backend: https://mingodad.github.io/CG-SQL-Lua-playground/ .
I'm also collecting several LALR(1) grammars at https://mingodad.github.io/parsertl-playground/playground/ , a Yacc/Lex-compatible online editor/interpreter that can generate EBNF (for railroad diagrams), SQL, and C++ from the grammars. Select "SQLite3 parser (partially working)" from "Examples", then click "Parse" to see the parse tree for the content in "Input source".
I also created https://mingodad.github.io/plgh/json2ebnf.html to give a unified view of tree-sitter grammars, and https://mingodad.github.io/lua-wasm-playground/ , where there is a Lua script that generates an alternative EBNF for writing tree-sitter grammars, which can later be converted to the standard "grammar.js".