[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZCz+sLuAUF65SaHqPUemsUb0WBhAhLYoaAs54VfH1V2w@mail.gmail.com>
Date: Fri, 26 Jul 2024 17:12:06 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Leon Hwang <hffilwlqm@...il.com>
Cc: Yonghong Song <yonghong.song@...ux.dev>, Zheao Li <me@...jusaka.me>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog
method to output failure logs to kernel
On Thu, Jul 25, 2024 at 7:57 PM Leon Hwang <hffilwlqm@...il.com> wrote:
>
>
>
> On 26/7/24 05:27, Andrii Nakryiko wrote:
> > On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@...il.com> wrote:
> >>
> >>
> >>
> >> On 25/7/24 14:09, Yonghong Song wrote:
> >>>
> >>> On 7/24/24 11:05 PM, Leon Hwang wrote:
> >>>>
> >>>> On 25/7/24 13:54, Yonghong Song wrote:
> >>>>> On 7/24/24 10:15 PM, Zheao Li wrote:
> >>>>>> This is a v2 patch, previous Link:
> >>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@manjusaka.me/T/#u
> >>>>>>
>
> [SNI]
>
[...]
> >
>
> Build and run, sudo ./retsnoop -e verbose -e bpf_log -e
> bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output:
>
>
> FUNCTION CALLS RESULT DURATION ARGS
> -------------- ------ -------- ----
> ↔ bpf_log [void] 1.350us log=NULL fmt='%s() is not a global
> function ' =(vararg)
>
> It's great to show arguments.
>
Thanks for repro steps, they worked. Also, I just pushed latest
retsnoop version to Github that does support capturing vararg
arguments for printf-like functions. See full debugging log at [0],
but I basically did just two things:
$ sudo retsnoop -e '*sys_bpf' --lbr -n freplace
-n freplace filters by process name, to avoid the noise. I traced
bpf() syscall (*sys_bf), and I requested function call LBR (Last
Branch Record) stack. LBR showed that we have
bpf_prog_attach_check_attach_type() call, and then eventually we get
to bpf_log().
So I then traced bpf_log (no --lbr this time, but I requested function
trace + arguments capture:
$ sudo retsnoop -n freplace -e '*sys_bpf' -a bpf_log -TA
17:02:39.968302 -> 17:02:39.968307 TID/PID 2730863/2730855 (freplace/freplace):
FUNCTION CALLS RESULT DURATION ARGS
----------------- --------- -------- ----
→ __x64_sys_bpf
regs=&{.r15=2,.r14=0xc0000061c0,.bp=0xc00169f8a8,.bx=28,.r11=514,.ax=0xffffffffffffffda,.cx=0x404f4e,.dx=64,.si=0xc00169fa10…
→ __sys_bpf cmd=28
uattr={{.kernel=0xc00169fa10,.user=0xc00169fa10}} size=64
↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a
global function ' vararg0='stub_handler_static'
← __sys_bpf [-EINVAL] 4.115us
← __x64_sys_bpf [-EINVAL] 5.467us
For __x64_sys_bpf that's struct pt_regs, which isn't that interesting,
but then we have:
↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a global
function ' vararg0='stub_handler_static'
Which showed format string and the argument passed to it:
'stub_hanler_static' subprogram seems to be the problem here.
Anyways, tbh, for a problem like this, it's probably best to just
request a verbose log when doing the BPF_PROG_LOAD command. You can
*normally* use veristat tool to get that easily, if you have a .bpf.o
object file on the disk. But in this case it's freplace and veristat
doesn't know what's the target BPF program, so it's not that useful in
this case:
$ sudo veristat -v freplace_bpfel.o
Processing 'freplace_bpfel.o'...
libbpf: prog 'freplace_handler': attach program FD is not set
libbpf: prog 'freplace_handler': failed to prepare load attributes: -22
libbpf: prog 'freplace_handler': failed to load: -22
libbpf: failed to load object 'freplace_bpfel.o'
PROCESSING freplace_bpfel.o/freplace_handler, DURATION US: 0, VERDICT:
failure, VERIFIER LOG:
File Program Verdict Duration (us) Insns
States Peak states
---------------- ---------------- ------- ------------- -----
------ -----------
freplace_bpfel.o freplace_handler failure 0 0
0 0
---------------- ---------------- ------- ------------- -----
------ -----------
Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
But for lots of other programs this would be a no-brainer.
[0] https://gist.github.com/anakryiko/88a1597a68e43dc945e40fde88a96e7e
[...]
>
> Is it OK to add a tracepoint here? I think tracepoint is more generic
> than retsnoop-like way.
I personally don't see a problem with adding tracepoint, but how would
it look like, given we are talking about vararg printf-style function
calls? I'm not sure how that should be represented in such a way as to
make it compatible with tracepoints and not cause any runtime
overhead.
>
> Thanks,
> Leon
>
>
Powered by blists - more mailing lists