8000 feat: update word list for 932232 by theseion · Pull Request #3276 · coreruleset/coreruleset · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: update word list for 932232 #3276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 36 commits into from
Sep 18, 2023

Conversation

theseion
Copy link
Contributor
@theseion theseion commented Aug 7, 2023

Fixes #2692.

@theseion theseion changed the title 2692 update word list for932232 feat: update word list for 932232 Aug 7, 2023
@theseion theseion force-pushed the 2692-update-word-list-for932232 branch from 7fd5d28 to 7e83627 Compare August 7, 2023 13:35
@dune73
Copy link
Member
dune73 commented Aug 7, 2023

Coming along nicely. How much more until this is a review-ready state?

Also: I presume the update to this list can not be automated. Please confirm.

@theseion
Copy link
Contributor Author
theseion commented Aug 7, 2023

I need to fix the tests, then it will be review ready. I will also see whether I can at least add one or two shell expressions to regenerate some of the lists from others.

The list for PL 3 cannot be automated, unless we decide to throw away the current contents and come up with some rule to include certain entries in that list.

@dune73
Copy link
Member
dune73 commented Aug 7, 2023

Got you. I would rather refrain from too much acrobatics. I reckon we will revisit the automation for 4.1 and consolidate some list updates via the toolchain. That would be the moment to link the srcs of various rules.

Copy link
Contributor
@franbuehler franbuehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this huge PR!!
It looks good to me. I found a few minor typos.
I generated the regexes too and got the same result.
It looks good. Maybe a second pair of eyes would be good. But LGTM in my opinion.

@Xhoenix
Copy link
Member
Xhoenix commented Aug 9, 2023

I've some additions for unix-shell.data but don't have write permissions here.

@Xhoenix
Copy link
Member
Xhoenix commented Aug 9, 2023

These are the additions in my fork:- Xhoenix@b9b6361

@dune73
Copy link
Member
dune73 commented Aug 9, 2023

Nice addition @Xhoenix, but what the heck is /bin/cd?

@Xhoenix
Copy link
Member
Xhoenix commented Aug 9, 2023

My bad. Just adding stuff that came to my mind, forgot it was a builtin.

@dune73
Copy link
Member
dune73 commented Aug 9, 2023

Same for /bin/fg and /bin/ll?

@theseion
Copy link
Contributor Author

Thanks @Xhoenix, I'll merge those.

@dune73
Copy link
Member
dune73 commented Aug 31, 2023

Status as I see it:

  • Merge stuff from @Xhoenix
  • Fix tests
  • Then final review and merge

Correct?

@Xhoenix
Copy link
Member
Xhoenix commented Sep 5, 2023

I opened a new PR for the unix-shell.data with a few extra additions
#3294

@theseion
Copy link
Contributor Author
theseion commented Sep 6, 2023

Status as I see it:

Merge stuff from @Xhoenix
Fix tests
Then final review and merge
Correct?

More or less :)

@theseion theseion force-pushed the 2692-update-word-list-for932232 branch 2 times, most recently from f0c95a0 to 3da54e3 Compare September 7, 2023 07:57
@theseion theseion marked this pull request as ready for review September 7, 2023 16:00
@theseion
Copy link
Contributor Author
theseion commented Sep 7, 2023

A couple of notes:

  • I've tried to add as much automation as possible. The scripts aren't perfect but should at least make things reproducible
  • Updating a list is still somewhat cumbersome, as multiple other files may need to be updated. The individual steps are automated though (as far as possible)
  • I've added a new flag to spell.sh that I need for one of the automations
  • I've added a new rule 932238 at PL3 to match all words that have been excluded at PL 2. We already had such a rule but only for User-Agent / Referer headers
  • For one of the lists we had forgotten to use exclusions at all, I've fixed that
  • For PL 3 I used a short list with only the excluded words at the previous level. I don't like that very much because it means we have to maintain a separate list. I still did it because at PL 3 the rule would otherwise be identical to the rule at PL 2 for only ~10 additional words. Not sure which approach is better.

@dune73
Copy link
Member
dune73 commented Sep 8, 2023

Wow, impressive.

So ready for review?

@theseion
Copy link
Contributor Author
theseion commented Sep 8, 2023

So ready for review?

Yes, that's why I requested your review ;)

Note that I updated the comment above with an additional point.

With the changes to exclusions of unix shell command words, some tests
no longer match. These have been turned into negative tests and copied
to the next PL level rule as positive tests to test that exculsions work
as intended.
The new automation script runs the list trough spell.sh to find english
words. These will be suffixed with `{{space-or-redirect}}` if they
aren't already.
Extend the automation script for unix-shell-upto3-for-cmdline.ra by
running the words through spell.sh and suffixing English words with `@`
automatically.
Add automation to unix-shell-fps-pl1.ra to add English words from the
source lists automatically
The words passed to spell.sh may have suffixes that must be removed
before passsing the words to wordnet. This commit adds the `-s` /
`--suffix` flag which takes a regular expression used to strip suffixes
from words.
The FreeBSD version of sed (as present under macOS) is incompatible with
GNU sed w.r.t. the in-place flag `-i`. To ensure that the scripts work
acrsoss platforms this commit replaces `sed -i` invocations with
equivalent `ed` commands.
The FP exclusion files contain all variants of a word. Explain why that
makes sense.
Various fixes to the scripts (some things only workded because I was
pasting the script into my zsh shell). Scripts now run reliably under
bash.
Some commands like `clang++` need to be escaped because they contain
regular expression meta characters.
Some commands include a `.`. This character should be escaped, since
it is, technically, a regular expression meta token.
@theseion theseion force-pushed the 2692-update-word-list-for932232 branch from 0c4cad6 to 76a5a39 Compare September 18, 2023 14:49
@theseion
Copy link
Contributor Author

Adding the two variables to unix-shell.data means they are detected by 932160. I've moved the tests there.

@dune73
Copy link
Member
dune73 commented Sep 18, 2023

I think we're good. I suggest to remove mandatory review by @fzipi. @franbuehler and I looked at this in detail.

Ready to be merged as far as I am concerned.

@fzipi fzipi requested review from dune73 and franbuehler September 18, 2023 15:40
Copy link
Member
@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦭 of ✅

@fzipi fzipi merged commit 4b729b1 into coreruleset:v4.0/dev Sep 18, 2023
@theseion theseion deleted the 2692-update-word-list-for932232 branch September 18, 2023 16:11
@theseion
Copy link
Contributor Author

"seal of checkmark"?? 🧐

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

update word list for rule 932232 (RCE Unix command injection part 4/4)
5 participants
0