Date: Mon, 04 May 2015 22:49:01 -0700
From: Alexei Starovoitov <ast@...mgrid.com>
To: Wang Nan <wangnan0@...wei.com>, davem@...emloft.net, acme@...nel.org,
 mingo@...hat.com, a.p.zijlstra@...llo.nl, masami.hiramatsu.pt@...achi.com,
 jolsa@...nel.org
CC: linux-kernel@...r.kernel.org, pi3orama@....com, hekuang@...wei.com,
 bgregg@...flix.com
Subject: Re: [RFC PATCH 00/22] perf tools: introduce 'perf bpf' command to load
 eBPF programs.

On 5/4/15 9:41 PM, Wang Nan wrote:
>
> That's great. Could you please append the description of 'llvm -s' into your README
> or comments? It has cost me a lot of time for dumping eBPF instructions so I decided to
> add it into perf...

sure. it's just the -filetype=asm flag to llc instead of -filetype=obj.
Eventually it will work as normal 'clang -S file.c' when a few more
llvm commits are accepted upstream.

>>> My colleague He Kuang is working on variable accessing. Probing inside function body
>>> and accessing its local variable will be supported like this:
>>>
>>> SEC("config") char _prog_config[] = "prog: func_name:1234 vara=localvara"
>>> int prog(struct pt_regs *ctx, unsigned long vara) {
>>>     // vara is the value of localvara of function func_name
>>> }
>>
>> that would be great. I'm not sure though how you can achieve that
>> without changing C front-end ?
>
> It's not very difficult. He is trying to generate the loader of vara
> as prologue, then paste the prologue and the main eBPF program together.
> From the viewpoint of kernel bpf verifier, there is only one param (ctx); the
> prologue program fetches the value of vara then puts it into a proper register,
> then the main program works.

got it. I think that's much cleaner than what I was proposing.
The only question is then:
char _prog_config[] = "prog: func_name:1234 vara=localvara"
should actually be something like "... r2=localvara", right?
since the prologue would need to assign into r2.
Otherwise I don't see where you find out about 'vara' inside
compiled bpf code.
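
To make the prologue idea concrete, here is a rough userspace sketch in plain C, not actual eBPF and not the code from the patch set. It only models the calling convention described above: one entry point that takes ctx alone, which loads the extra arguments and then hands off to the main program. The names fake_pt_regs, main_prog and prog_entry are made up for illustration, and the di/si/dx layout assumes the x86-64 argument registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pt_regs; only the
 * x86-64 registers that carry the first three arguments are modelled. */
struct fake_pt_regs {
    uint64_t di; /* arg1 */
    uint64_t si; /* arg2 */
    uint64_t dx; /* arg3 */
};

/* "Main program": written as if the probed function's arguments were
 * passed in directly, in addition to the (unused) context pointer. */
static uint64_t main_prog(struct fake_pt_regs *unused, int fd,
                          const char *buf, uint64_t wr_size)
{
    (void)unused;
    (void)fd;
    (void)buf;
    return wr_size;
}

/* "Prologue": the only entry point, taking ctx alone. It fetches each
 * value from the context and places it in the next argument slot, the
 * way a generated prologue would fill r2, r3, ... before the main body. */
static uint64_t prog_entry(struct fake_pt_regs *ctx)
{
    int fd = (int)ctx->di;
    const char *buf = (const char *)(uintptr_t)ctx->si;
    uint64_t wr_size = ctx->dx;

    return main_prog(ctx, fd, buf, wr_size);
}
```

In the real thing the hand-off would presumably be a fall-through into the pasted main program rather than a C call, but the register assignment is the same.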
Would be nice if this can be done without debug info.
Like in tracex2_kern.c I have:
SEC("kprobe/sys_write")
int bpf_prog(struct pt_regs *ctx)
{
        long wr_size = ctx->dx; /* arg3 */

with your prolog generator the above can be rewritten as:
SEC("kprobe/sys_write")
int bpf_prog(struct pt_regs *unused, int fd, char *buf, size_t wr_size)
{
        /* use wr_size */

that will improve ease of use a lot.

> Another possible solution is to change the protocol between kprobe and eBPF
> program, makes kprobes calls fetchers and passes them to eBPF program as
> a second param (group all varx together).
> A prologue may still be needed in this case to load each param into the correct
> register.

you mean grouping varx together in some other struct and embedding it
together with pt_regs into a new container struct?
doable, but your first approach is quite clean already. why bother.

> Could you please consider the following problem?
>
> We find there are several __lock_page() calls that last a very long time. We are going
> to find the corresponding __unlock_page() so we can know what blocks them. We want to
> insert eBPF programs before io_schedule() in __lock_page(), and also add an eBPF program
> on the entry of __unlock_page(), so we can compute the interval between page locking and
> unlocking. If the time is longer than a threshold, let __unlock_page() trigger a perf sampling
> so we get its call stack. In this case, the eBPF program acts as a trace filter.

all makes sense and your use case fits quite well into the existing
bpf+kprobe model. I'm not sure why you're calling it a 'problem'.
A problem of how to display that call stack from perf?
I would say it fits better as a sample than a trace.
If you dump it as a trace, it won't be easy to decipher, whereas if you
treat it as a sampling event, the perf record/report facility will pick it
up and display it nicely. Meaning that one sample == lock_page/unlock_page
latency > N. Then the existing sample_callchain flag should work.
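
For reference, the filter logic in that use case is roughly the following. This is a plain C userspace sketch under my own assumptions, not eBPF: the fixed-size table stands in for a BPF map keyed by page address, the caller supplies the timestamp that a real program would read from a clock helper, and SLOTS, THRESHOLD_NS, on_lock_wait and on_unlock are all hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS 64                 /* hypothetical table size */
#define THRESHOLD_NS 1000000ULL  /* hypothetical threshold N: 1 ms */

struct slot {
    uintptr_t page;     /* key: address of the page being locked */
    uint64_t start_ns;  /* time the first waiter was seen */
    int used;
};

static struct slot table[SLOTS];

/* Return the slot holding this page, or (if want_free) a free slot. */
static struct slot *find_slot(uintptr_t page, int want_free)
{
    struct slot *free_slot = NULL;
    size_t i;

    for (i = 0; i < SLOTS; i++) {
        if (table[i].used && table[i].page == page)
            return &table[i];
        if (!table[i].used && !free_slot)
            free_slot = &table[i];
    }
    return want_free ? free_slot : NULL;
}

/* Probe placed before io_schedule() in __lock_page(). */
static void on_lock_wait(uintptr_t page, uint64_t now_ns)
{
    struct slot *s = find_slot(page, 1);

    if (!s || s->used)
        return; /* table full, or an earlier waiter is already recorded */
    s->page = page;
    s->start_ns = now_ns;
    s->used = 1;
}

/* Probe placed at the entry of __unlock_page().
 * Returns 1 when the interval exceeds the threshold, i.e. when a
 * sample (with call chain) should be emitted; 0 otherwise. */
static int on_unlock(uintptr_t page, uint64_t now_ns)
{
    struct slot *s = find_slot(page, 0);
    uint64_t delta;

    if (!s)
        return 0; /* nobody waited on this page */
    delta = now_ns - s->start_ns;
    s->used = 0;
    return delta > THRESHOLD_NS;
}
```

The return value of on_unlock is the filter decision: one sample per lock_page/unlock_page pair whose latency is above N, everything else dropped.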
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/