8000 Unable to match "x AND x AND x ..." · Issue #1066 · lalrpop/lalrpop · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Unable to match "x AND x AND x ..." #1066

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ccleve opened this issue May 22, 2025 · 4 comments
Open

Unable to match "x AND x AND x ..." #1066

ccleve opened this issue May 22, 2025 · 4 comments

Comments

@ccleve
Copy link
ccleve commented May 22, 2025

I'm trying to match a list of Exprs connected by "AND". Nothing seems to work.

Here's one attempt:

<starts:(<Expr> "And")+> <end:Expr> => {...},

which yields:

  ./src/query/lalrpop/query_parser.lalrpop:34:5: 34:82: Multiple productions for the same reduction

    The following symbols can be reduced into a Expr2 in two ways
      (<Expr2> "And")+ Expr1 "And"
    They could be reduced using the production on line 34:
    <starts:(<Expr> "And")+> <end:Expr> => And::new_with_first(head, tails).into()
    ...or using the production on line 15:
    Expr

  ./src/query/lalrpop/query_parser.lalrpop:38:5: 38:60: Conflict detected

      when in this state:
    (<Expr2> "And")+ = Expr2 (*) "And" ["Word", "OtherWord", "And", "Or", "Not", "(", ")", "DoubleQuote", "+", "-", "Field", "WordWithWildcards", Eof]
    Expr = Expr Expr2 (*) ["Word", "OtherWord", "And", "Or", "Not", "(", ")", "DoubleQuote", "+", "-", "Field", "WordWithWildcards", Eof]

    and looking at a token `"And"` we can reduce to a `Expr` but we can also shift

  ./src/query/lalrpop/query_parser.lalrpop:34:13: 34:27: Local ambiguity detected

    The problem arises after having observed the following symbols in the input:
      (<Expr2> "And")+ Expr2 "And"
    At that point, if the next token is a `"Word"`, then the parser can proceed in
    two different ways.

    First, the parser could execute the production at
    ./src/query/lalrpop/query_parser.lalrpop:34:13: 34:27, which would consume the
    top 3 token(s) from the stack and produce a `(<Expr2> "And")+`. This might
    then yield a parse tree like
      (<Expr2> "And")+ Expr2 "And" Expr1
      ├─(<Expr2> "And")+─────────┘     │
      └─Expr2──────────────────────────┘

    Alternatively, the parser could execute the production at
    ./src/query/lalrpop/query_parser.lalrpop:34:13: 34:27, which would consume the
    top 2 token(s) from the stack and produce a `(<Expr2> "And")+`. This might
    then yield a parse tree like
      Expr2          "And" Expr1
      ├─(<Expr2> "And")+─┘     │
      └─Expr2──────────────────┘

    See the LALRPOP manual for advice on making your grammar LR(1).

I can't make heads or tails of this.

In another attempt I had a lone Expr first, followed by ("And" <Expr>)+ and got a similar local ambiguity error.

This problem is similar to the Comma problem describe here: https://lalrpop.github.io/lalrpop/tutorial/006_macros.html. I attempted to use the Comma macro straight out of the docs and got a voluminous error message similar to the one above.

I don't get it. Where's the ambiguity?

@chanbengz
Copy link
Contributor
chanbengz commented May 22, 2025

Aha, it is an associative and precedence problem. See https://lalrpop.github.io/lalrpop/tutorial/004_full_expressions.html.

The error msg already told you that

    First, the parser could execute the production at
    ./src/query/lalrpop/query_parser.lalrpop:34:13: 34:27, which would consume the
    top 3 token(s) from the stack and produce a `(<Expr2> "And")+`. This might
    then yield a parse tree like
      (<Expr2> "And")+ Expr2 "And" Expr1
      ├─(<Expr2> "And")+─────────┘     │
      └─Expr2──────────────────────────┘

    Alternatively, the parser could execute the production at
    ./src/query/lalrpop/query_parser.lalrpop:34:13: 34:27, which would consume the
    top 2 token(s) from the stack and produce a `(<Expr2> "And")+`. This might
    then yield a parse tree like
      Expr2          "And" Expr1
      ├─(<Expr2> "And")+─┘     │
      └─Expr2──────────────────┘

it could be either left-associative or right-associative, and that's the ambiguity. So IMO, if the #[precedence(level="1")] #[assoc(side="left")] won't work for you, then don't do the macro. Instead, a recursive one will handle this case easily.

#[precedence(level="1")] #[assoc(side="left")]
<starts:Expr> "And" <end:Expr> => {...},
8000

@ccleve
Copy link
Author
ccleve commented May 22, 2025

@chanbengz Thanks. The recursive approach works, but it produces a tree like this: (((x AND x) AND x) AND x). That's what I'm trying to avoid because it requires a post-processing step to flatten it into (x AND x AND x AND x).

@chanbengz
Copy link
Contributor

Post-processing is inevitable if you'd evaluate expressions, not to mention that recursive form has advantage that the precedence is clear if enables braces like (Expr And (Expr Or Expr)).

precedence only works for binary combination like Expr -> Expr Op Expr so it won't take effect for macro. And I think this ambiguity cannot be solved unless we add an ad-hoc solution to the parse table generation, which I'd rather not do.

@chanbengz
Copy link
Contributor

If you're trying to match a sequence of conjunctive expression and flatten it, my approach would be to define a variant Expr::AndSeq(Vec<Expr>) firstly, and append the following Exprs to it, say

<starts:Expr> "And" <end:Expr> => {
    match starts {
        Expr::AndSeq(ref mut v) => v.push(Expr::Something(end));
        _ => panic!("not an And expression")
    }
    starts
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
0