Description
# Test environment
Linux localhost 5.4.80-geec5f34899c0 #3 SMP PREEMPT Mon Nov 1 04:55:56 CST 2021 armv8l GNU/Linux
Android with a 5.4 kernel on a quad-core ARM Cortex-A55 platform
Hi,
While using the latest tracepoint-based hardirqs on an arm64 platform with Linux kernel 5.4, I ran into an issue where some irq events appear more than once in the output. See the example below.
(bcc)root@localhost:/# hardirqs 10 1
Tracing hard irq event time... Hit Ctrl-C to end.
HARDIRQ TOTAL_usecs
rtk-ir 53
v_rpc 198
rtk-ir 361
a_rpc 1235
a_rpc 2566
dc2vo 3686
EMMC 5743
dc2vo 8056
EMMC 10051
eth0 11720
eth0 12816
98007d00.i2c_0 16919
981d0000.gpu 33344
arch_timer 136990
arch_timer 327240
I investigated and found that this happens because the current tracepoint-based hardirqs uses the irq name as the key to store data, and it reads the irq name by calling TP_DATA_LOC_READ_CONST(&key.name, name, sizeof(key.name)); (line 88 in 60e0de9).
No matter how long the irq name string is, the BPF program always copies 32 bytes into local storage and uses them as the key to update the map.
Unfortunately, when kernel code calls trace_irq_handler_entry(irq, action) to send out the trace event, it uses __assign_str to update the irq name.
TRACE_EVENT(irq_handler_entry,
    TP_PROTO(int irq, struct irqaction *action),
    TP_ARGS(irq, action),
    TP_STRUCT__entry(
        __field( int, irq )
        __string( name, action->name )
    ),
    TP_fast_assign(
        __entry->irq = irq;
        __assign_str(name, action->name);
    ),
    TP_printk("irq=%d name=%s", __entry->irq, __get_str(name))
);
https://elixir.bootlin.com/linux/v5.4.34/source/include/trace/events/irq.h#L66
In the end, it uses strcpy to copy the string.
#define __assign_str(dst, src) \
strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");
https://elixir.bootlin.com/linux/v5.4.34/source/include/trace/trace_events.h#L673
Take my case for example: of the two a_rpc entries, one is the correct a_rpc, and the other is a_rpc\0\0\0..._0 in the name[32] used as the map key.
The extra _0 is meaningless stale data left over from the previous 98007d00.i2c_0 event. strcpy copied only a_rpc\0, so the remaining bytes should be ignored, but the BPF program cannot know the exact length of the irq name before copying it.
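To make the mechanism concrete, here is a minimal user-space sketch in plain C (not BPF code, and not the actual bcc implementation). It assumes the trace buffer slot is reused without being cleared, which is my reading of the behavior above. The names NAME_LEN, read_key, sanitize_key and keys_match are hypothetical helpers for illustration only. sanitize_key shows one possible workaround: zero every byte after the first NUL before using the buffer as a map key.

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN 32 /* same size as key.name in the issue (name[32]) */

/* Simulate a reused buffer slot (assumption: the slot is not cleared
 * between events). The earlier name is written first, then the later
 * name is written with strcpy, which only overwrites strlen(cur) + 1
 * bytes. Finally all NAME_LEN bytes are copied, like the fixed-size
 * read described in the issue. */
void read_key(char key[NAME_LEN], const char *prev, const char *cur)
{
    char slot[NAME_LEN] = {0};

    assert(strlen(prev) < NAME_LEN && strlen(cur) < NAME_LEN);
    strcpy(slot, prev);          /* earlier event, e.g. 98007d00.i2c_0 */
    strcpy(slot, cur);           /* later event, e.g. a_rpc */
    memcpy(key, slot, NAME_LEN); /* stale tail is copied along */
}

/* Hypothetical workaround: clear everything after the first NUL so the
 * key depends only on the string itself. The loop has a fixed bound. */
void sanitize_key(char key[NAME_LEN])
{
    int seen_nul = 0;

    for (int i = 0; i < NAME_LEN; i++) {
        if (seen_nul)
            key[i] = '\0';
        else if (key[i] == '\0')
            seen_nul = 1;
    }
}

/* Return 1 if the key read after `prev` is byte-identical to a clean,
 * zero-padded key for `cur`, otherwise 0. */
int keys_match(const char *prev, const char *cur, int sanitize)
{
    char clean[NAME_LEN] = {0};
    char key[NAME_LEN];

    strcpy(clean, cur);
    read_key(key, prev, cur);
    if (sanitize)
        sanitize_key(key);
    return memcmp(key, clean, NAME_LEN) == 0;
}
```

With this sketch, keys_match("98007d00.i2c_0", "a_rpc", 0) returns 0, which corresponds to the two separate a_rpc rows in the output, while keys_match("98007d00.i2c_0", "a_rpc", 1) returns 1. Whether a loop like this is acceptable inside the BPF program on a 5.4 kernel is something I have not verified.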
I am not sure why this issue does not happen on the x86-64 platform, but is there a better way to deal with this?
Could you please give me some advice? Thank you. :)