-
Notifications
You must be signed in to change notification settings - Fork 449
Introducing decoder pkg #1405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing decoder pkg #1405
Conversation
e3f1515
to
e39a68d
Compare
09504db
to
c502272
Compare
ff912c9
to
1375f79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's awesome @mtcherni95!
It seems like this will be a big improvement to performance
func TestContextSize(t *testing.T) { | ||
var v Context | ||
assert.Equal(t, int(v.GetSizeBytes()), 104) | ||
} | ||
func TestChunkMetaSize(t *testing.T) { | ||
var v ChunkMeta | ||
|
||
assert.Equal(t, int(v.GetSizeBytes()), 45) | ||
} | ||
|
||
func TestVfsWriteMetaSize(t *testing.T) { | ||
var v VfsWriteMeta | ||
assert.Equal(t, int(v.GetSizeBytes()), 20) | ||
} | ||
|
||
func TestKernelModuleMetaSize(t *testing.T) { | ||
var v KernelModuleMeta | ||
assert.Equal(t, int(v.GetSizeBytes()), 24) | ||
} | ||
|
||
func TestMprotectWriteMetaSize(t *testing.T) { | ||
var v MprotectWriteMeta | ||
assert.Equal(t, int(v.GetSizeBytes()), 8) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of bytes of each struct is hardcoded here, as well as in the GetSizeBytes() function, so do these tests really test something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanivagman please see this #1405 (comment)
If agree, then in the tests I will use unsafe.Sizeof to get exact size of the struct and assert it's equal to the hardcoded value.
ctx.Retval = int64(binary.LittleEndian.Uint64(decoder.buffer[offset+88 : offset+96])) | ||
ctx.StackID = binary.LittleEndian.Uint32(decoder.buffer[offset+96 : offset+100]) | ||
ctx.Argnum = uint8(binary.LittleEndian.Uint16(decoder.buffer[offset+100 : offset+102])) | ||
decoder.cursor += ctx.GetSizeBytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need all of those GetSizeBytes() functions? To me it seems that they give us no real benefit as anyway the exact bytes offsets are hardcoded here. In addition, we will have to maintain those functions correct whenever we update one of the structs, increasing the chances of future programming mistakes.
If the problem it solves is padding (for example context is of size 104 while we read till byte 102 here), we can just add another field to the relevant structs of padding (for example, two more bytes for padding at the end of Context).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I agree with @yanivagman. I just left a comment about the hard coded values asking if we could get the arg type's sizes during initialization (using reflect type size or something like that) and have them constant... even if we need to adjusting padding on the affected structs...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanivagman , I'm fine with removing the GetSizeBytes()
functions. I wouldn't add additional data to the structs just for expose the size of it. I would simply add it hard-coded.
@rafaeldtinoco I prefer not to use reflection as it negatively affects efficiency (binary.Read
uses it, and that's the whole issue with it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could still have GetSizeBytes()
which returns an hardcoded value, and in the tests (https://github.com/aquasecurity/tracee/pull/1405/files#diff-242d9e577032010728b6349d850cb4bcb8c517923dd77d50ad684732b43c23efR31) we use unsafe.Sizeof
to calcualte the size struct (just need to take into account pad added by compiler, when the struct is created, to make it a multiple of 8 bytes). In this way: the test is meaningful as it will fail if someone changes the structs without updating the decoder, and the GetSizeBytes()
has its purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafaeldtinoco I prefer not to use reflection as it negatively affects efficiency (
binary.Read
uses it, and that's the whole issue with it).
Yep, it was just a way to say something else than hardcoded in here, possibly calculated as const in the startup. Does not necessarily need to be reflect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=== RUN TestVfsWriteMetaSize
protocol_test.go:31:
Error Trace: protocol_test.go:31
Error: Not equal:
expected: 28
actual : 20
Test: TestVfsWriteMetaSize
--- FAIL: TestVfsWriteMetaSize (0.00s)
=== RUN TestKernelModuleMetaSize
--- PASS: TestKernelModuleMetaSize (0.00s)
=== RUN TestMprotectWriteMetaSize
--- PASS: TestMprotectWriteMetaSize (0.00s)
FAIL
FAIL github.com/aquasecurity/tracee/tracee-ebpf/tracee/internal/bufferdecoder 0.005s
After adding a variable to the arg type. I'm fine with that, since the PR checks will get the issue and allow the fix to happen (and I can understand why hard coding the size has better performance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtcherni95 Overall I think this is a very nice change and I like the way it has been done with minimal changes to the parsing logic, etc. Left some comments on top of Yaniv's review, which I think address same concerns I had mostly (and some other nit fixes).
ctx.Retval = int64(binary.LittleEndian.Uint64(decoder.buffer[offset+88 : offset+96])) | ||
ctx.StackID = binary.LittleEndian.Uint32(decoder.buffer[offset+96 : offset+100]) | ||
ctx.Argnum = uint8(binary.LittleEndian.Uint16(decoder.buffer[offset+100 : offset+102])) | ||
decoder.cursor += ctx.GetSizeBytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I agree with @yanivagman. I just left a comment about the hard coded values asking if we could get the arg type's sizes during initialization (using reflect type size or something like that) and have them constant... even if we need to adjusting padding on the affected structs...
fb08aaf
to
0ea2b74
Compare
c7907bf
to
22b50b3
Compare
assert.Equal(t, nil, err) | ||
// checking decoding succeeded correctly B41A | ||
assert.Equal(t, ctxExpected, ctxObtained) | ||
// checking decoder cursor on buffer moved approiately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: spelling of approiately (multiple spelling issues because of copy/paste)
ctx.Retval = int64(binary.LittleEndian.Uint64(decoder.buffer[offset+88 : offset+96])) | ||
ctx.StackID = binary.LittleEndian.Uint32(decoder.buffer[offset+96 : offset+100]) | ||
ctx.Argnum = uint8(binary.LittleEndian.Uint16(decoder.buffer[offset+100 : offset+102])) | ||
decoder.cursor += ctx.GetSizeBytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=== RUN TestVfsWriteMetaSize
protocol_test.go:31:
Error Trace: protocol_test.go:31
Error: Not equal:
expected: 28
actual : 20
Test: TestVfsWriteMetaSize
--- FAIL: TestVfsWriteMetaSize (0.00s)
=== RUN TestKernelModuleMetaSize
--- PASS: TestKernelModuleMetaSize (0.00s)
=== RUN TestMprotectWriteMetaSize
--- PASS: TestMprotectWriteMetaSize (0.00s)
FAIL
FAIL github.com/aquasecurity/tracee/tracee-ebpf/tracee/internal/bufferdecoder 0.005s
After adding a variable to the arg type. I'm fine with that, since the PR checks will get the issue and allow the fix to happen (and I can understand why hard coding the size has better performance).
59c2d95
to
d239924
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
d239924
to
ab5807e
Compare
Hi, after running some benchmarks on Tracee @NDStrahilevitz and I saw that the processEvents logic was using a lot of computational time in binary.Read() function. Please see flame graph:
After exploring the internals of binary.Read() we saw it uses the
reflect
package which compromises program efficiency.Moreover, from
binary
package description: "[...] This package favors simplicity over efficiency. Clients that require high-performance serialization, especially for large data structures should look at more advanced solutions [...]".The intent of this PR is to maximize efficiency over simplicity.
A
decoder
package is introduced.Similarly to
binary
, thedecoder
knows to read a sequence of bytes coming from the Tracee eBPF program to user-space and translate them to golang structs defined under theprotocol
package.Here is a screenshot from a benchmark tests that compares the efficiency of the new

decoder
package vsbinary.Read
operations.The
decoder
package translation scope is much smaller than the one ofbinary.Read
, indeedbinary.Read
accept an interface while thedecoder
translate only to specific data structures. Howeverdecoder
is much (much!) more efficient.Solves: #1407