8000 Perf issue with regexes that start with repeating digits · Issue #1708 · coreruleset/coreruleset · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Perf issue with regexes that start with repeating digits #1708
Closed
@CRS-migration-bot

Description

@CRS-migration-bot

Issue for tracking original pull request created by user allanrbo on date 2020-03-03 01:55:23.
Link to original PR: SpiderLabs/owasp-modsecurity-crs#1708.

HEAD is: 1c1ca6aaa2077d78fd3c6f0a03cef29fd14e5b85
BASE is: 8fe1a39
On the Slack channel 2020-03-02, Airween pointed out some PCRE performance issues with requests containing long consecutive strings of digits. He suggested a fix using the (*SKIP) flag. Since this flag is PCRE specific, I'd like to avoid it, in order to pave the way for modsec-alternatives that use regex libraries with guarenteed linear run time (RE2, Hyperscan, and Go's Regexp, for example).

This PR relates is a suggestion to modify 941120, 942210, and 942260.

Here's my understanding of the issue: The regexes in these rules all have the option to begin matching with a repeated number of digits, such as [0-9]+. For example the regex [0-9]+a would match 11111a, 111111111a, 11111111111111a, etc. Now the problem arises with for example a string like 11111 (without an a). In this example, PCRE has consumed 11111 and then looks for a, doesn't find it, and therefore backtracks to try instead with just 1111, then 111, then 11, and then finally try with 1. To avoid this backtracking, we can instead anchor the regex before the [0-9]+, to either the beginning of the string or some character that is not a digit. The fixed example then becomes (?:^|[^0-9])[0-9]+a.

Mirko suggested below a much cleaner approach than my first approach: simply don't bother with the quantifier. Updated the PR with this cleaner approach.

Tested manually with a large request as suggested by Airween, and verified that now perform many orders of magnitude faster.

There is also a perf issue with 942130, but I will send a PR for a suggestion to fix this separately, as I have (by chance) been working on some changes to this rule for some weeks.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ⌛ Stale issueThis issue has been open 120 days with no activity.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0