derive: fix cgroupv1 hid false derives by NDStrahilevitz · Pull Request #2453 · aquasecurity/tracee · GitHub

derive: fix cgroupv1 hid false derives #2453

NDStrahilevitz
Collaborator

Initial Checklist

  • There is an issue describing the need for this PR.
  • Git log contains summary of the change.
  • Git log contains motivation and context of the change.
  • If part of an EPIC, PR git log contains EPIC number.
  • If part of an EPIC, PR was added to EPIC description.

Description (git log)

Author: Nadav Strahilevitz <nadav.strahilevitz@aquasec.com>
Date:   Thu Dec 8 11:08:17 2022 +0000

    derive: fix cgroupv1 hid false derives
    
    In cgroupv1 systems, cgroup_mkdir events come from many different
    hierarchies, some existing outside of cpuset, where we query for the
    container cgroups.
    Because these cgroup_mkdir events would go through a derive process
    anyway, for each controller in the system there would be a failed derive
    searching through the entire cpuset controller with no results.
    
    This fix adds a check in v1 systems to see if the hierarchy id is
    the default one, otherwise derivation is skipped.

Fixes: #issue_number
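
The check described in the git log can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `Event`, `CgroupInfo`, `shouldDerive`, `deriveContainerCreate`, and the `lookup` callback are hypothetical names standing in for tracee's real types and its cgroup query.

```go
package main

// Event is a hypothetical, trimmed-down stand-in for tracee's trace.Event.
type Event struct {
	HierarchyID uint32 // hierarchy id (hid) carried by cgroup_mkdir / cgroup_rmdir
	CgroupID    uint64
}

// CgroupInfo is a hypothetical summary of the host's cgroup setup.
type CgroupInfo struct {
	IsV1       bool   // true on cgroup v1 systems
	DefaultHID uint32 // hid of the hierarchy that is queried for container cgroups
}

// shouldDerive reports whether a container event should be derived from ev.
// On cgroup v1, events from any hierarchy other than the default one are
// skipped, so they neither trigger a search nor produce duplicates.
func shouldDerive(info CgroupInfo, ev Event) bool {
	if info.IsV1 && ev.HierarchyID != info.DefaultHID {
		return false
	}
	return true
}

// deriveContainerCreate returns the derived arguments for ev, or false when
// derivation is skipped or the cgroup is not a known container.
// lookup is an assumed callback that maps a cgroup id to a container id.
func deriveContainerCreate(
	info CgroupInfo,
	lookup func(cgroupID uint64) (string, bool),
	ev Event,
) ([]interface{}, bool) {
	if !shouldDerive(info, ev) {
		return nil, false // skipped before any lookup is attempted
	}
	containerID, ok := lookup(ev.CgroupID)
	if !ok {
		return nil, false
	}
	return []interface{}{containerID}, true
}
```

The point of placing the check first is that the lookup, which is the expensive part on v1 systems, never runs for events from non-default hierarchies.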

Type of change

  • Bug fix (non-breaking change fixing an issue, preferable).
  • Quick fix (minor non-breaking change requiring no issue, use with care)
  • Code refactor (code improvement and/or code removal)
  • New feature (non-breaking change adding functionality).
  • Breaking change (cause existing functionality not to work as expected).

How Has This Been Tested?

Tested with sonobuoy performance tests.
This bug caused:

  1. loss of events
  2. a bottleneck in deriveEvents, and subsequently in decodeEvents, due to missing cgroup caches

pprof tests were run on a cgroup v1 system on k8s.

Final Checklist:

Pick "Bug Fix" or "Feature", delete the other and mark appropriate checks.

  • I have made corresponding changes to the documentation.
  • My code follows the style guidelines (C and Go) of this project.
  • I have performed a self-review of my own code.
  • I have commented all functions/methods created explaining what they do.
  • I have commented my code, particularly in hard-to-understand areas.
  • My changes generate no new warnings.
  • I have added tests that prove my fix, or feature, is effective.
  • New and existing unit tests pass locally with my changes.
  • Any dependent changes have been merged and published before.

Git Log Checklist:

My commit logs have:

  • Subject starts with "subsystem|file: description".
  • Do not end the subject line with a period.
  • Limit the subject line to 50 characters.
  • Separate subject from body with a blank line.
  • Use the imperative mood in the subject line.
  • Wrap the body at 72 characters.
  • Use the body to explain what and why instead of how.

Collaborator
@yanivagman yanivagman left a comment

LGTM
A small nit to fix in the comments

@@ -15,6 +16,10 @@ func ContainerCreate(containers *containers.Containers) deriveFunction {

func deriveContainerCreateArgs(containers *containers.Containers) func(event trace.Event) ([]interface{}, error) {
	return func(event trace.Event) ([]interface{}, error) {
		// if cgroup_id is from non default hid (v1 case), this isn't a container, so we can skip

Actually, this is a container, but we want to avoid duplicates and needless searches when the hid is not the default one.

@@ -15,6 +15,10 @@ func ContainerRemove(containers *containers.Containers) deriveFunction {

func deriveContainerRemoveArgs(containers *containers.Containers) deriveArgsFunction {
	return func(event trace.Event) ([]interface{}, error) {
		// if cgroup_id is from non default hid (v1 case), this isn't a container, so we can skip

same

@NDStrahilevitz NDStrahilevitz force-pushed the fix/false_hid_container_derives branch from 06b4dd7 to 53f4011 Compare December 8, 2022 12:31
@NDStrahilevitz NDStrahilevitz merged commit f6c75b4 into aquasecurity:main Dec 8, 2022