8000 tracee-rules: evaluate parsed input with OPA by danielpacak · Pull Request #829 · aquasecurity/tracee · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

tracee-rules: evaluate parsed input with OPA #829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2021
Merged

tracee-rules: evaluate parsed input with OPA #829

merged 1 commit into from
Jul 29, 2021

Conversation

danielpacak
Copy link
Contributor
@danielpacak danielpacak commented Jul 20, 2021

With OPA Go SDK we can significantly improve performance by passing
rego.EvalParsedInput instead of rego.EvalInput option to rego.Eval()
method.

Co-authored-by: Simar simar@linux.com
Signed-off-by: Simar simar@linux.com
Signed-off-by: Daniel Pacak pacak.daniel@gmail.com

@danielpacak danielpacak marked this pull request as ready for review July 23, 2021 15:34
@danielpacak danielpacak requested review from itaysk and simar7 July 23, 2021 15:36
@danielpacak
Copy link
Contributor Author

tracee_opa_eval_parsed_input

@itaysk
Copy link
Collaborator
itaysk commented Jul 25, 2021

@danielpacak you mention that this PR is just partially implementing the optimization, do you want to merge it, or it's draft until you add the part in consumeSources? If you want to add one by one, then I think the toParsedEvent and related functions should be taken out of test and made available to engine? (Also it has some overlap with #830 which I wasn't clear how you plan to reconcile). I'm approving anyway, just please share your plan, and possible open an issue to track it if you think it's necessary

@danielpacak
Copy link
Contributor Author

@danielpacak you mention that this PR is just partially implementing the optimization, do you want to merge it, or it's draft until you add the part in consumeSources? If you want to add one by one, then I think the toParsedEvent and related functions should be taken out of test and made available to engine? (Also it has some overlap with #830 which I wasn't clear how you plan to reconcile). I'm approving anyway, just please share your plan, and possible open an issue to track it if you think it's necessary

Hey @itaysk Sorry for the confusion. I believe we should merge #830 first because we do need ToUnstructured method to do any optimisation. Then I'll rebase this one and add missing code in consumeSources. Does it make sense?

@@ -166,6 +189,15 @@ func (engine *Engine) consumeSources(done <-chan bool) {
}
}

func (engine *Engine) dispatchEvent(s types.Signature, pe regosig.ParsedEvent, event types.Event) {
switch {
case strings.Contains(reflect.TypeOf(s).String(), "rego"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielpacak @simar7 why are we not using the type assert/switch here? I remember Daniel mentioned something the doesn't work but isn't this what we want? https://play.golang.org/p/en8jJCHWiqw

Copy link
Contributor Author
@danielpacak danielpacak Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our case is more like https://play.golang.org/p/wETpR604pdj

@itaysk I tried one more time using switch type and the following snippet seems to be working:

func (engine *Engine) dispatchEvent(s types.Signature, pe ParsedEvent) {
	switch s.(type) {
	case *regosig.RegoSignature: // Make sure to use pointer because we use pointer receiver for the RegoSignature implementation
		engine.signatures[s] <- pe
	default:
		engine.signatures[s] <- pe.Event
	}
}

However, it will introduce the cycle because of the engine.ParsedEvent type. We could fix that by moving the ParsedEvent to a shared package, maybe types package?

@simar7 Could you explain what was the advantage of using reflect.TypeOf(s) here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no advantage in this case. The only problem there is that we lose testability if we hardcode the type to *regosig.RegoSignature.

In the test for consumeSources we use a fake type called regoFakeSignature

type regoFakeSignature struct {
- When evaluating the following code, this will result in taking the default path instead which is not the desired behaviour.

func (engine *Engine) dispatchEvent(s types.Signature, pe ParsedEvent) {
	switch s.(type) {
	case *regosig.RegoSignature: // Note this isn't the same as regoFakeSignature
		engine.signatures[s] <- pe
	default:
		engine.signatures[s] <- pe.Event
	}
}

The other approach if we really wanted to use switch type assert would be to manually list each gosig as a case (since they are all defined types) and default for each regosig. So something like the following:

	switch s.(type) {
	case *stdioOverSocket, *antiDebugging: 
		engine.signatures[s] <- pe.Event
	default:
		engine.signatures[s] <- pe
	}
}

This doesn't scale well as it would require a change to this function every time a new Go signature is added/used.

For the above reasons I went with a reflect.TypeOf(s) instead. If I've missed something or if there is a better way to do it, happy to understand and discuss.

Copy link
Collaborator
@itaysk itaysk Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the only holdback from doing it using type assertion (assuming we agree it's better) is just the test, I'd sacrifice it and not he production code. But I think this is a symptom of a bigger problem, the fact that engine is knowing which concrete signature types it's invoking is breaking the entire premise of the Signature interface. I don't have a better suggestion yet, just wanted to bring this point to attention. @simar7 @danielpacak

@danielpacak danielpacak marked this pull request as draft July 27, 2021 09:43
@simar7
Copy link
Member
simar7 commented Jul 28, 2021

Observation:

While this is a step in the right direction, we still have more work to do. This is how the flamegraph looks today:
image

And this is how it looks after implementing this change:
image

As you can see the JSON round tripper was replaced with our InterfaceToValue call. While quantitatively this is a minor improvement, three additional tracks to work on can be identified as follows:

  1. 20% of total CPU: engine.signatureStart, primarily a track to optimize rego prepared queries and their eval.
  2. 25% of total CPU: runtime.gcBgMarkWorker, the goroutine over head track.
  3. 20% of total CPU: engine.consumeSources, event handling track.

With OPA Go SDK we can significantly improve performance by passing
rego.EvalParsedInput instead of rego.EvalInput option to rego.Eval()
method.

Co-authored-by: Simar <simar@linux.com>
Signed-off-by: Simar <simar@linux.com>
Signed-off-by: Daniel Pacak <pacak.daniel@gmail.com>
@danielpacak danielpacak changed the title feat: Evaluate parsed input with OPA tracee-rules: evaluate parsed input with OPA Jul 29, 2021
@danielpacak danielpacak marked this pull request as ready for review July 29, 2021 21:02
@danielpacak danielpacak requested review from simar7 and itaysk July 29, 2021 21:02
@simar7 simar7 merged commit 2d2845f into aquasecurity:main Jul 29, 2021
@danielpacak danielpacak deleted the opa_eval_parsed_input branch July 29, 2021 21:25
@itaysk
Copy link
Collaborator
itaysk commented Jul 31, 2021

thanks for working on merging this. do we have a sense on how much this improves by? I believe @simar7 told my ~10%, is that true? @danielpacak

@danielpacak
Copy link
Contributor Author

thanks for working on merging this. do we have a sense on how much this improves by? I believe @simar7 told my ~10%, is that true? @danielpacak

@itaysk Here are sample results from the benchmark of rego.Eval input versus rego.EvalParsedInput for 6 (open source) and 71 (closed source) signatures with 2 input events that trigger findings. Evaluating a prepared query seems to be 3-4 times faster. However, overall performance of the app depends on the workload, i.e. the stream of input events and how often an event is selected for evaluation (signatureIndex impl.).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0