Date: Sat, 5 Mar 2022 16:28:17 -0800
From: Yonghong Song <yhs@...com>
To: Namhyung Kim <namhyung@...nel.org>, Alexei Starovoitov <ast@...nel.org>,
    Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>
Cc: Martin KaFai Lau <kafai@...com>, Song Liu <songliubraving@...com>,
    John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
    netdev@...r.kernel.org, bpf@...r.kernel.org,
    Eugene Loh <eugene.loh@...cle.com>, Peter Zijlstra <peterz@...radead.org>,
    Hao Luo <haoluo@...gle.com>
Subject: Re: [RFC] A couple of issues on BPF callstack

On 3/4/22 3:28 PM, Namhyung Kim wrote:
> Hello,
>
> While I'm working on lock contention tracepoints [1] for a future BPF
> use, I found some issues on the stack trace in BPF programs. Maybe
> there are things that I missed but I'd like to share my thoughts for
> your feedback. So please correct me if I'm wrong.
>
> The first thing I found is how it handles skipped frames in the
> bpf_get_stack{,id}. Initially I wanted a short stack trace like 4
> depth to identify callers quickly, but it turned out that 4 is not
> enough and it's all filled with the BPF code itself.
>
> So I set to skip 4 frames but it always returns an error (-EFAULT).
> After some time I figured out that BPF doesn't allow to set skip
> frames greater than or equal to buffer size. This seems strange and
> looks like a bug. Then I found a bug report (and a partial fix) [2]
> and work on a full fix now.

Thanks for volunteering. Looking forward to the patch.

>
> But it revealed another problem with BPF programs on perf_event which
> use a variant of stack trace functions. The difference is that it
> needs to use a callchain in the perf sample data. The perf callchain
> is saved from the beginning while BPF callchain is saved at the last to
> limit the stack depth by the buffer size. But I can handle that.
>
> More important thing to me is the content of the (perf) callchain. If
> the event has __PERF_SAMPLE_CALLCHAIN_EARLY, it will have context info
> like PERF_CONTEXT_KERNEL. So user might or might not see it depending
> on whether the perf_event set with precise_ip and SAMPLE_CALLCHAIN.
> This doesn't look good.

Patch 7b04d6d60fcf ("bpf: Separate bpf_get_[stack|stackid] for perf
events BPF") tried to fix the __PERF_SAMPLE_CALLCHAIN_EARLY issue for
the bpf_get_stack[id]() helpers. The helpers check whether
event->attr.sample_type has __PERF_SAMPLE_CALLCHAIN_EARLY encoded or
not, and retrieve the stacks accordingly. Did you see any issue here?

>
> After all, I think it'd be really great if we can skip those
> uninteresting info easily. Maybe we could add a flag to skip BPF code

We cannot just skip those callchains with __PERF_SAMPLE_CALLCHAIN_EARLY.
There are real use cases for it.

> perf context, and even some scheduler code from the trace respectively
> like in stack_trace_consume_entry_nosched().

A flag for the bpf_get_stack[id]() helpers? It is possible. It would be
great if you can detail your use case here and how a flag could help
you.

>
> Thoughts?
>
> Thanks,
> Namhyung
>
>
> [1] https://lore.kernel.org/all/20220301010412.431299-1-namhyung@kernel.org/
> [2] https://lore.kernel.org/bpf/30a7b5d5-6726-1cc2-eaee-8da2828a9a9c@oracle.com/
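
The interaction between the skip count and the buffer size discussed in the thread can be illustrated with a small user-space model. This is only a sketch and not the kernel's code: the function names, the `cap` parameter, and the exact condition that yields -EFAULT are assumptions made for illustration. It assumes the reported behaviour comes from limiting the captured depth to the buffer size before the skip is applied, and contrasts that with capturing skip + buffer entries.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical user-space model, NOT the kernel implementation.
 * `cap` is the maximum number of frames captured before `skip` is applied.
 * Returns the number of frames copied into out[], or -EFAULT when no
 * frame is left after skipping (an assumption based on the error
 * described in the thread).
 */
static long model_get_stack(const unsigned long *stack, size_t depth,
                            unsigned long *out, size_t out_entries,
                            size_t skip, size_t cap)
{
    assert(stack != NULL && out != NULL);

    /* Only the first `cap` frames of the stack are captured. */
    size_t captured = depth < cap ? depth : cap;
    if (captured <= skip)
        return -EFAULT;

    /* Copy what remains after the skip, bounded by the output buffer. */
    size_t n = captured - skip;
    if (n > out_entries)
        n = out_entries;
    for (size_t i = 0; i < n; i++)
        out[i] = stack[skip + i];
    return (long)n;
}

/* Assumed reported behaviour: capture depth limited to the buffer size. */
long get_stack_capped_by_buffer(const unsigned long *stack, size_t depth,
                                unsigned long *out, size_t out_entries,
                                size_t skip)
{
    return model_get_stack(stack, depth, out, out_entries, skip,
                           out_entries);
}

/* One possible fix: capture skip + buffer entries, then apply the skip. */
long get_stack_capped_with_skip(const unsigned long *stack, size_t depth,
                                unsigned long *out, size_t out_entries,
                                size_t skip)
{
    return model_get_stack(stack, depth, out, out_entries, skip,
                           out_entries + skip);
}
```

In this model, a 10-frame stack with a 4-entry buffer and skip = 4 fails in the first variant, because all captured frames are skipped. The second variant copies frames 4 through 7.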