lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZhsQeDn8biUnt9WXt6RVcW_PPX76YFyZo6CjEXGKTdDg@mail.gmail.com>
Date: Mon, 29 Jul 2024 14:01:20 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Leon Hwang <hffilwlqm@...il.com>
Cc: Yonghong Song <yonghong.song@...ux.dev>, Zheao Li <me@...jusaka.me>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, 
	Jiri Olsa <jolsa@...nel.org>, bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog
 method to output failure logs to kernel

On Fri, Jul 26, 2024 at 9:04 PM Leon Hwang <hffilwlqm@...il.com> wrote:
>
>
>
> On 2024/7/27 08:12, Andrii Nakryiko wrote:
> > On Thu, Jul 25, 2024 at 7:57 PM Leon Hwang <hffilwlqm@...il.com> wrote:
> >>
> >>
> >>
> >> On 26/7/24 05:27, Andrii Nakryiko wrote:
> >>> On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@...il.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 25/7/24 14:09, Yonghong Song wrote:
> >>>>>
> >>>>> On 7/24/24 11:05 PM, Leon Hwang wrote:
> >>>>>>
> >>>>>> On 25/7/24 13:54, Yonghong Song wrote:
> >>>>>>> On 7/24/24 10:15 PM, Zheao Li wrote:
> >>>>>>>> This is a v2 patch, previous Link:
> >>>>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@manjusaka.me/T/#u
> >>>>>>>>
> >>
> >> [SNI]
> >>
> >
> > [...]
> >
> >>>
> >>
> >> Build and run, sudo ./retsnoop -e verbose -e bpf_log -e
> >> bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output:
> >>
> >>
> >> FUNCTION CALLS   RESULT  DURATION  ARGS
> >> --------------   ------  --------  ----
> >> ↔ bpf_log        [void]   1.350us  log=NULL fmt='%s() is not a global
> >> function ' =(vararg)
> >>
> >> It's great to show arguments.
> >>
> >
> > Thanks for repro steps, they worked. Also, I just pushed latest
> > retsnoop version to Github that does support capturing vararg
> > arguments for printf-like functions. See full debugging log at [0],
> > but I basically did just two things:
> >
> > $ sudo retsnoop -e '*sys_bpf' --lbr -n freplace
> >
> > -n freplace filters by process name, to avoid the noise. I traced
> > bpf() syscall (*sys_bf), and I requested function call LBR (Last
> > Branch Record) stack. LBR showed that we have
> > bpf_prog_attach_check_attach_type() call, and then eventually we get
> > to bpf_log().
> >
> > So I then traced bpf_log (no --lbr this time, but I requested function
> > trace + arguments capture:
> >
> > $ sudo retsnoop -n freplace -e '*sys_bpf' -a bpf_log -TA
> >
> > 17:02:39.968302 -> 17:02:39.968307 TID/PID 2730863/2730855 (freplace/freplace):
> >
> > FUNCTION CALLS      RESULT     DURATION  ARGS
> > -----------------   ---------  --------  ----
> > → __x64_sys_bpf
> > regs=&{.r15=2,.r14=0xc0000061c0,.bp=0xc00169f8a8,.bx=28,.r11=514,.ax=0xffffffffffffffda,.cx=0x404f4e,.dx=64,.si=0xc00169fa10…
> >     → __sys_bpf                          cmd=28
> > uattr={{.kernel=0xc00169fa10,.user=0xc00169fa10}} size=64
> >         ↔ bpf_log   [void]      1.550us  log=NULL fmt='%s() is not a
> > global function ' vararg0='stub_handler_static'
> >     ← __sys_bpf     [-EINVAL]   4.115us
> > ← __x64_sys_bpf     [-EINVAL]   5.467us
> >
> >
> > For __x64_sys_bpf that's struct pt_regs, which isn't that interesting,
> > but then we have:
> >
> > ↔ bpf_log   [void]      1.550us  log=NULL fmt='%s() is not a global
> > function ' vararg0='stub_handler_static'
>
> It's awesome to show vararg.
>
> >
> > Which showed format string and the argument passed to it:
> > 'stub_hanler_static' subprogram seems to be the problem here.
> >
> >
> > Anyways, tbh, for a problem like this, it's probably best to just
> > request a verbose log when doing the BPF_PROG_LOAD command. You can
> > *normally* use veristat tool to get that easily, if you have a .bpf.o
> > object file on the disk. But in this case it's freplace and veristat
> > doesn't know what's the target BPF program, so it's not that useful in
> > this case:
> >
> > $ sudo veristat -v freplace_bpfel.o
> > Processing 'freplace_bpfel.o'...
> > libbpf: prog 'freplace_handler': attach program FD is not set
> > libbpf: prog 'freplace_handler': failed to prepare load attributes: -22
> > libbpf: prog 'freplace_handler': failed to load: -22
> > libbpf: failed to load object 'freplace_bpfel.o'
> > PROCESSING freplace_bpfel.o/freplace_handler, DURATION US: 0, VERDICT:
> > failure, VERIFIER LOG:
> >
> > File              Program           Verdict  Duration (us)  Insns
> > States  Peak states
> > ----------------  ----------------  -------  -------------  -----
> > ------  -----------
> > freplace_bpfel.o  freplace_handler  failure              0      0
> >  0            0
> > ----------------  ----------------  -------  -------------  -----
> > ------  -----------
> > Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
> >
> > But for lots of other programs this would be a no-brainer.
> >
> >
> >   [0] https://gist.github.com/anakryiko/88a1597a68e43dc945e40fde88a96e7e
> >
> > [...]
> >
> >>
> >> Is it OK to add a tracepoint here? I think tracepoint is more generic
> >> than retsnoop-like way.
> >
> > I personally don't see a problem with adding tracepoint, but how would
> > it look like, given we are talking about vararg printf-style function
> > calls? I'm not sure how that should be represented in such a way as to
> > make it compatible with tracepoints and not cause any runtime
> > overhead.
>
> The tracepoint is not about vararg printf-style function calls.
>
> It is to trace the reason why it fails to bpf_check_attach_target() at
> attach time.
>

Oh, that changes things. I don't think we can keep adding extra
tracepoints for various potential reasons that BPF prog might be
failing to verify.

But there is usually no need either. This particular code already
supports emitting extra information into verifier log, you just have
to provide that. This is done by libbpf automatically, can't your
library of choice do the same (if BPF program failed).

Why go to all this trouble if we already have a facility to debug
issues like this. Note every issue is logged into verifier log, but in
this case it is.

> So, let me introduce bpf_check_attach_target_with_tracepoint() and
> BPF_LOG_KERNEL_WITHOUT_PRINT. bpf_check_attach_target_with_tracepoint()
> is to call bpf_check_attach_target() and then to call
> trace_bpf_check_attach_target() if err. BPF_LOG_KERNEL_WITHOUT_PRINT is
> to avoid pr_err() in bpf_verifier_vlog().
>
> Here's the diff without trace_bpf_check_attach_target() definition:

[...]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ