logoalt Hacker News

thesz11/08/20242 repliesview on HN

How can one define an infinite grammar in Rust?

E.g., a context-free rule S ::= abc|aabbcc|aaabbbccc|... can effectively parse a^Nb^Nc^N which is an example of context-sensitive grammar.

This is a simple example, but something like that can be seen in practice. One example is when language allows definition of operators.

So, how does Rust handle that?


Replies

ryandv11/08/2024

In Haskell I think it's something like:

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import qualified Data.Text as T

    type ParseError = String

    csgParse :: T.Text -> Either ParseError Int
    csgParse = eitherResult . parse parser where
    parser = do
        as <- many' $ char 'a'
        let n = length as
        count n $ char 'b'
        count n $ char 'c'
        char '\n'
        return n

    ghci> csgParse "aaabbbccc\n"
    Right 3
show 1 reply
jeroenhd11/08/2024

Using parser combinator library "nom", this should probably do what you'd want:

    fn parse_abc(input: &str, n: usize) -> IResult<&str, (Vec<char>, Vec<char>, Vec<char>)> {
      let (input, result) = tuple(( many_m_n(n, n, char('a')),
                                  many_m_n(n, n, char('b')),
                                  many_m_n(n, n, char('c'))
                            ))(input)?;
      Ok((input, result)) 
    }
    
It parses (the beginning of) the input, ensuring `n` repetitions of 'a', 'b', and 'c'. Parse errors are reported through the return type, and the remaining characters are returned for the application to deal with as it sees fit.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

show 1 reply