8000 lsm: cgroup attachment type support by altugbozkurt07 · Pull Request #1135 · aya-rs/aya · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

lsm: cgroup attachment type support #1135

8000
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

altugbozkurt07
Copy link
@altugbozkurt07 altugbozkurt07 commented Jan 14, 2025

Hi @vadorovsky, @dave-tucker,

This is the refactored work based on the discussion we have had on discord.
Let me know if i missed anything.

Best

This change is Reviewable

Copy link
netlify bot commented Jan 14, 2025

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 2c3eaea
🔍 Latest deploy log https://app.netlify.com/projects/aya-rs-docs/deploys/684050dc5d172b0008fb4bfd
😎 Deploy Preview https://deploy-preview-1135--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@mergify mergify bot added aya This is about aya (userspace) aya-bpf This is about aya-bpf (kernel) aya-obj Relating to the aya-obj crate test A PR that improves test cases or CI labels Jan 14, 2025
@tamird
Copy link
Member
tamird commented Jan 14, 2025

Please avoid opening a new PR each time. There are comments I left in #1131 that remain unaddressed.

@altugbozkurt07
Copy link
Author

@tamird sorry, since we have changed the way we implemented api, i thought it deserved a new pr.

For the comments that remain unaddressed;
1- nix package is used in init crate so that is why i left it there. If you still want me to remove it from workspace and include it in specific crates where its used.
2- Done
3- Done
4- Removed the empty comment line
5- The changes proposed in this pr

Am i missing something other than what is stated in your comments?

Copy link
Member
@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've take a quick pass over and there are a few nits that need clearing up.
Please also check that the docs build and render correctly 🙏

@@ -44,12 +46,12 @@ use crate::{
/// program.attach()?;
/// # Ok::<(), LsmError>(())
/// ```
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-instate this line unless you're fixing a docs rendering issue.

/// The minimum kernel version required to use this feature is 6.0.
///
/// # Examples
/// ## LSM with cgroup attachment type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this subheading

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i remove this subheading, should i also remove it from lsm.rs ?

@tamird
Copy link
Member
tamird commented Jan 14, 2025

Please let us know when the tests are passing, or if you need help understanding the failures.

@altugbozkurt07
Copy link
Author

@dave-tucker thanks for your detailed feedback, i have updated the commit accordingly. Let me know if things are good to go for this one.

Copy link
Member
@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing.

@altugbozkurt07 altugbozkurt07 force-pushed the lsm_cgroup_api branch 2 times, most recently from 97a52dd to f67626f Compare January 21, 2025 15:03
Copy link
mergify bot commented Jan 21, 2025

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

@mergify mergify bot added the api/needs-review Makes an API change that needs review label Jan 21, 2025
@mergify mergify bot requested a review from alessandrod January 21, 2025 15:04
Copy link
Member
@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 15 files at r1, 1 of 8 files at r2, 14 of 15 files at r3, all commit messages.
Dismissed @dave-tucker from 17 discussions.
Reviewable status: 20 of 23 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @altugbozkurt07)


-- commits line 2 at r3:
why is this a double colon?


xtask/src/run.rs line 416 at r3 (raw file):

                // Heed the advice and boot with noapic. We don't know why this happens.
                kernel_args.push(" noapic");
                kernel_args.push(" lsm=lockdown,capability,bpf");

Please add a comment.


test/integration-test/src/tests/lsm.rs line 36 at r3 (raw file):

    let cgroup_path = Path::new("/sys/fs/cgroup/lsm_cgroup_test");
    if !cgroup_path.exists() {

i think you can drop this check


test/integration-test/src/tests/lsm.rs line 40 at r3 (raw file):

    }

    let _ = prog.attach(File::open(cgroup_path).unwrap()).unwrap();

what's the deal with this let _?


test/integration-test/src/tests/lsm.rs line 42 at r3 (raw file):

    let _ = prog.attach(File::open(cgroup_path).unwrap()).unwrap();

    match unsafe { fork().expect("Failed to fork process") } {

why do we need this fork? if the goal is to show that per-pid filtering occurs, you aren't doing that right now.


test/integration-test/src/tests/lsm.rs line 50 at r3 (raw file):

            let mut f = File::create(cgroup_path.join("cgroup.procs"))
                .expect("could not open cgroup procs");
            f.write_fmt(format_args!("{}", pid.as_raw() as u64))

how about write!(&mut f, "{pid}")?


test/integration-test/src/tests/lsm.rs line 59 at r3 (raw file):

        ForkResult::Child => {
            assert_matches::assert_matches!(TcpListener::bind("127.0.0.1:12345"), Ok(listener) => assert_eq!(
                listener.local_addr().unwrap(), SocketAddr::V4(SocketAddrV4::new(Ipv4Addr::new(127, 0, 0, 1), 12345)))

do we need this assertion? probably all we care about is that it didn't error


test/integration-test/src/tests/lsm.rs line 80 at r3 (raw file):

    prog.attach().unwrap();

    assert_matches::assert_matches!(TcpListener::bind("127.0.0.1:12345"), Err(e) => assert_eq!(

we should also do this before attaching the program.

Copy link
Author
@altugbozkurt07 altugbozkurt07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 20 of 23 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @tamird)


-- commits line 2 at r3:

Previously, tamird (Tamir Duberstein) wrote…

why is this a double colon?

done


test/integration-test/src/tests/lsm.rs line 36 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i think you can drop this check

done


test/integration-test/src/tests/lsm.rs line 40 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

what's the deal with this let _?

done


test/integration-test/src/tests/lsm.rs line 42 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why do we need this fork? if the goal is to show that per-pid filtering occurs, you aren't doing that right now.

i wanted to make sure that only the processes in the specified cgroup are affected by the lsm hook, and other processes should be able to perform actions without any issues. Maybe a bug might be accidentally introduced in the lsm_cgroup implementation in ebpf vm runtime that might cause it to get executed against all workloads, which is very very less likely to happen. But i thought this might not hurt to check


test/integration-test/src/tests/lsm.rs line 50 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

how about write!(&mut f, "{pid}")?

done


test/integration-test/src/tests/lsm.rs line 59 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

do we need this assertion? probably all we care about is that it didn't error

done


test/integration-test/src/tests/lsm.rs line 80 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

we should also do this before attaching the program.

done

@altugbozkurt07 altugbozkurt07 force-pushed the lsm_cgroup_api branch 2 times, most recently from 703c053 to 11fc9e4 Compare January 31, 2025 13:17
Copy link
Member
@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 4 files at r4.
Reviewable status: 20 of 23 files reviewed, 5 unresolved discussions (waiting on @alessandrod and @altugbozkurt07)


-- commits line 2 at r3:

Previously, altugbozkurt07 wrote…

done

please remove the space before the colon


test/integration-test/src/tests/lsm.rs line 42 at r3 (raw file):

Previously, altugbozkurt07 wrote…

i wanted to make sure that only the processes in the specified cgroup are affected by the lsm hook, and other processes should be able to perform actions without any issues. Maybe a bug might be accidentally introduced in the lsm_cgroup implementation in ebpf vm runtime that might cause it to get executed against all workloads, which is very very less likely to happen. But i thought this might not hurt to check

It is not our job to test the ebpf VM.


xtask/src/run.rs line 416 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Please add a comment.

This comment is not sufficient. Please explain why, not what.

Copy link
Author
@altugbozkurt07 altugbozkurt07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 15 of 23 files reviewed, 5 unresolved discussions (waiting on @alessandrod and @tamird)


-- commits line 2 at r3:

Previously, tamird (Tamir Duberstein) wrote…

please remove the space before the colon

done


test/integration-test/src/tests/lsm.rs line 42 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

It is not our job to test the ebpf VM.

okay updated the test accordingly


xtask/src/run.rs line 416 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

This comment is not sufficient. Please explain why, not what.

done

@tamird
Copy link
Member
tamird commented May 8, 2025

I'd like @dave-tucker to have a final look.

Copy link
Member
@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small changes required and then this should be good to go! Thanks for persevering with this @altugbozkurt07

@@ -0,0 +1,108 @@
//! LSM probes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we've modelled this differently from LSM, this should probably be:

//! LSM Cgroup Probes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

let prog_fd = self.fd()?;
let prog_fd = prog_fd.as_fd();
let cgroup_fd = cgroup.as_fd();
let attach_type = self.data.expected_attach_type.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't safely unwrap here.
You have 2 choices:

  1. Implement from_pin manually and set the self.data.expected_attach_type appropriately. Then add a comment along the lines of:
            // Unwrap safety: the function starts with `self.fd()?` that will succeed if and only
            // if the program has been loaded, i.e. there is an fd. That happens if:
            // - LsmCgroup::load has been called
            // - LsmCgroup::from_pin has been called
            //
            // In all cases, expected_attach_type is guaranteed to be Some(_).
  1. Use Some(BPF_LSM_CGROUP)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing the comment I provided ☝️ re: unwrap safety.
Without that, its not clear why it's safe to unwrap in that codepath.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i implemented the from_pin method to make sure expected_attach_type is guaranteed to be Some(_) and added that comment lines on on top of unwrap. Is that good to go now? or anything else i missed in there ?

@altugbozkurt07 altugbozkurt07 force-pushed the lsm_cgroup_api branch 2 times, most recently from f1d26b7 to bd945db Compare May 8, 2025 21:21
@altugbozkurt07 altugbozkurt07 requested a review from dave-tucker May 8, 2025 21:21
Copy link
Member
@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r25, all commit messages.
Reviewable status: 22 of 24 files reviewed, 2 unresolved discussions (waiting on @alessandrod, @altugbozkurt07, and @dave-tucker)

@tamird
Copy link
Member
tamird commented May 12, 2025

Seems there are lint failures.

Copy link
Member
@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dave-tucker
Copy link
Member

@altugbozkurt07 once you fix the lint issues I think this should be good to go.

};

#[test]
#[ignore = "LSM programs need a special kernel config, which is not supported by GitHub runners[waiting on PR: 1063]."]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this can be addressed now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i actually updated the tests accordingly but it seems to still fail @tamird can you take a look pls.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me that is_program_supported is a sufficient check. See

// For most types, `EINVAL` typically indicates it is not supported.
// However, further examination is required for tracing, extension, and lsm.
Some(EINVAL) => {
// At this point for tracing, extension, and lsm, loading failed due to unset
// `attach_btf_id`, so we examine verifier log for the target message. The
// message originated from `check_attach_btf_id()`[0] in v5.5 to v5.9, then
// moved to `bpf_check_attach_target()`[1] in v5.10 and onward.
//
// If target message is present in the logs, then loading process has reached
// up to the verifier section, which indicates that the kernel is at least
// aware of the program type variants.
//
// If the verifier log is empty, then it was immediately rejected by the
// kernel, meaning the types are not supported.
//
// [0] https://elixir.bootlin.com/linux/v5.5/source/kernel/bpf/verifier.c#L9535
// [1] https://elixir.bootlin.com/linux/v5.9/source/kernel/bpf/verifier.c#L10849
let supported = matches!(
verifier_log,
Some(verifier_log) if verifier_log.starts_with(b"Tracing programs must provide btf_id")
);
Ok(supported)
}

and

// `ENOTSUPP` from `check_struct_ops_btf_id()`[0] indicates that it reached the
// verifier section, meaning the kernel is at least aware of the type's existence.
//
// Otherwise, it will produce `EINVAL`, meaning the type is immediately rejected
// and does not exist.
//
// [0] https://elixir.bootlin.com/linux/v5.6/source/kernel/bpf/verifier.c#L9740
Some(524) if program_type == ProgramType::StructOps => Ok(true),

perhaps @tyrone-wu can help us with the proper check.

@altugbozkurt07 altugbozkurt07 force-pushed the lsm_cgroup_api branch 3 times, most recently from 3b407e2 to 5b606ad Compare May 23, 2025 12:41
@tamird tamird requested a review from dave-tucker May 24, 2025 16:43
Copy link
Member
@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r26, 7 of 7 files at r27, all commit messages.
Reviewable status: 22 of 24 files reviewed, 7 unresolved discussions (waiting on @alessandrod, @dave-tucker, and @tyrone-wu)


test/integration-test/src/tests/lsm.rs line 48 at r26 (raw file):

    assert_matches::assert_matches!(TcpListener::bind("127.0.0.1:12345"), Ok(_));

nit: i'd restore this blank line


test/integration-test/src/tests/lsm.rs line 21 at r27 (raw file):

        eprintln!("skipping lsm_cgroup test on kernel {kernel_version:?}");
        return;
    }

can we remove this now that we have a feature check instead?

Code quote:

    let kernel_version = KernelVersion::current().unwrap();
    if kernel_version < KernelVersion::new(6, 0, 0) {
        eprintln!("skipping lsm_cgroup test on kernel {kernel_version:?}");
        return;
    }

test/integration-test/src/tests/lsm.rs line 23 at r27 (raw file):

    }

    if !(is_program_supported(aya::programs::ProgramType::Lsm).unwrap()) {

too many parens?


test/integration-test/src/tests/lsm.rs line 77 at r27 (raw file):

    }

    if !(is_program_supported(aya::programs::ProgramType::Lsm).unwrap()) {

too many parens?

};

#[test]
#[ignore = "LSM programs need a special kernel config, which is not supported by GitHub runners[waiting on PR: 1063]."]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me that is_program_supported is a sufficient check. See

// For most types, `EINVAL` typically indicates it is not supported.
// However, further examination is required for tracing, extension, and lsm.
Some(EINVAL) => {
// At this point for tracing, extension, and lsm, loading failed due to unset
// `attach_btf_id`, so we examine verifier log for the target message. The
// message originated from `check_attach_btf_id()`[0] in v5.5 to v5.9, then
// moved to `bpf_check_attach_target()`[1] in v5.10 and onward.
//
// If target message is present in the logs, then loading process has reached
// up to the verifier section, which indicates that the kernel is at least
// aware of the program type variants.
//
// If the verifier log is empty, then it was immediately rejected by the
// kernel, meaning the types are not supported.
//
// [0] https://elixir.bootlin.com/linux/v5.5/source/kernel/bpf/verifier.c#L9535
// [1] https://elixir.bootlin.com/linux/v5.9/source/kernel/bpf/verifier.c#L10849
let supported = matches!(
verifier_log,
Some(verifier_log) if verifier_log.starts_with(b"Tracing programs must provide btf_id")
);
Ok(supported)
}

and

// `ENOTSUPP` from `check_struct_ops_btf_id()`[0] indicates that it reached the
// verifier section, meaning the kernel is at least aware of the type's existence.
//
// Otherwise, it will produce `EINVAL`, meaning the type is immediately rejected
// and does not exist.
//
// [0] https://elixir.bootlin.com/linux/v5.6/source/kernel/bpf/verifier.c#L9740
Some(524) if program_type == ProgramType::StructOps => Ok(true),

perhaps @tyrone-wu can help us with the proper check.

@tyrone-wu
Copy link
Contributor
tyrone-wu commented May 25, 2025

For lsm test, just is_program_supported should be alright since BPF_LSM_MAC and LSM prog type were added together in v5.7.

The lsm_cgroup checks seem sufficient since BPF_LSM_CGROUP was added later in v6.0.

I did some digging, and I believe the local test is failing because bpf lsm isn't fully "activated" lockc-project/lockc#159. I verified that CONFIG_BPF_LSM=y which explains why it was able to loading and attaching without error, but it seems that bpf needs to be added to the LSM list for it to come into effect. I don't see a proper way to enable it in the gh runner, so you may need to add another check if it can't be added.
The qemu vm tests on my end is passing, just not local.

@altugbozkurt07
Copy link
Author

yes the gh x86 runners dont support nested virtualization. @tamird should i manually parse the boot config and check if bpf lsm is enabled or how should i proceed in this case ?

@tyrone-wu
Copy link
Contributor
tyrone-wu commented Jun 1, 2025

I think checking /sys/kernel/security/lsm might be easier for this specific case.

I'm not sure what the process is, but I wonder if we can request github to enable bpf lsm in their runners.

@altugbozkurt07
Copy link
Author

@tamird, i could write a small helper function to check if bpf is enabled in lsm config through sysfs api. Should i put this function in the lsm test file or i guess this could be a generic utility function that can be re-used across the code, in that case should it go down in feature probes file. @tyrone-wu thanks for your insights!!

@tamird
Copy link
Member
tamird commented Jun 2, 2025

@tamird, i could write a small helper function to check if bpf is enabled in lsm config through sysfs api. Should i put this function in the lsm test file or i guess this could be a generic utility function that can be re-used across the code, in that case should it go down in feature probes file. @tyrone-wu thanks for your insights!!

I think it's fine to put it in the test for now. It can move out of there when someone needs it.

@altugbozkurt07
Copy link
Author

@tamird , hi, can you check what is wrong with my latest changes ?

@tamird
Copy link
Member
tamird commented Jun 12, 2025

Surely you can read the logs?

---- tests::lsm::lsm stdout ----

thread 'tests::lsm::lsm' panicked at test/integration-test/src/tests/lsm.rs:17:10:
called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::lsm::lsm_cgroup stdout ----

thread 'tests::lsm::lsm_cgroup' panicked at test/integration-test/src/tests/lsm.rs:17:10:
called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }


failures:
    tests::lsm::lsm
    tests::lsm::lsm_cgroup

Looks like /sys/kernel/security/lsm doesn't exist and your tests are crashing trying to read it.

@altugbozkurt07
Copy link
Author

sorry for that, i misread the message its been a long week for me. I guess the presense of this file depends on a kernel boot configuration ? CONFIG_SECURITY needs to be enabled in the boot config for this files presence, though i am not sure. How should i proceed @tamird ?

@tamird
Copy link
Member
tamird commented Jun 13, 2025

Handle the error?

@altugbozkurt07
Copy link
Author

i could surely check if file exists before trying to read from it but i was trying to ask what i should do if it does not exist, should i simply conclude lsm is not supported if that file does not exist? @tamird

@tamird
Copy link
Member
tamird commented Jun 14, 2025

Doesn't sound like a sensible approach since it doesn't exist in any of the virtualized tests either, meaning we'd not be running these new tests at all. Perhaps @dave-tucker knows if we can enable this somehow? ChatGPT says sudo mount -t securityfs securityfs /sys/kernel/security is a thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/needs-review Makes an API change that needs review aya This is about aya (userspace) aya-bpf This is about aya-bpf (kernel) aya-obj Relating to the aya-obj crate test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0