8000 Can DFA/NFA construction be offloaded to the regex_automata crate? · Issue #947 · lalrpop/lalrpop · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Can DFA/NFA construction be offloaded to the regex_automata crate? #947
Open
@Pat-Lafon

Description

@Pat-Lafon

Thanks for the report!

I spent some time profiling this. The hot spot is this function, which looks like it scales very poorly on an increase in the number of NFA states, and based on a cursory reading of this bit, it looks like a star repeat pattern is getting represented as one state, whereas a counted group of N repetitions is N states. So the repetition group hits the poor scaling in transitive_closure really badly and loops a lot.

Googling around, the theoretical algorithmic complexity to calculate a transitive closure is O(n^3), so I doubt we'll get much win there. The NFA states we store don't have internal state to handle a "number of repetitions" counter, so that's why we have to populate the graph with every state.

Based on all that, I don't see any options for a quick fix here. This seems like a design consequence of the high level algorithm, rather than something that can be locally fixed. I do wonder why we're doing all this building NFA states internally instead of handing it off to regex_automata. It's possible that some of this can be offloaded and the specialized library plausible could have better handling here. But at least personally, I'd need quite a bit more high level diving in to the code in order to take that on.

So, looping back to the high level question, I don't think you're doing anything "wrong" persay, but this seems to just be a fundamental issue with how lalrpop is handling regex tokens. My instinct is to say to use the ".*" approach and then check the length in the action code as a workaround.

Originally posted by @dburgener in #941 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0