linux-kernel - Re: [PATCH RFC bpf-next 1/3] bpf: report probe fault to BPF stderr

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3571660.QJadu78ljV@7950hx>
Date: Tue, 07 Oct 2025 14:14:35 +0800
From: Menglong Dong <menglong.dong@...ux.dev>
To: Menglong Dong <menglong8.dong@...il.com>,
 Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>, bpf <bpf@...r.kernel.org>,
 LKML <linux-kernel@...r.kernel.org>,
 linux-trace-kernel <linux-trace-kernel@...r.kernel.org>, jiang.biao@...ux.dev
Subject: Re: [PATCH RFC bpf-next 1/3] bpf: report probe fault to BPF stderr

On 2025/10/2 10:03, Alexei Starovoitov wrote:
> On Fri, Sep 26, 2025 at 11:12 PM Menglong Dong <menglong8.dong@...il.com> wrote:
> >
> > Introduce the function bpf_prog_report_probe_violation(), which is used
> > to report the memory probe fault to the user by the BPF stderr.
> >
> > Signed-off-by: Menglong Dong <menglong.dong@...ux.dev>
> > ---
> >  include/linux/bpf.h      |  1 +
> >  kernel/trace/bpf_trace.c | 18 ++++++++++++++++++
> >  2 files changed, 19 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 6338e54a9b1f..a31c5ce56c32 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -2902,6 +2902,7 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
> >  void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
> >  void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr);
> >  void bpf_prog_report_arena_violation(bool write, unsigned long addr, unsigned long fault_ip);
> > +void bpf_prog_report_probe_violation(bool write, unsigned long fault_ip);
> >
> >  #else /* !CONFIG_BPF_SYSCALL */
> >  static inline struct bpf_prog *bpf_prog_get(u32 ufd)
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 8f23f5273bab..9bd03a9f53db 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2055,6 +2055,24 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
> >         module_put(mod);
> >  }
> >
> > +void bpf_prog_report_probe_violation(bool write, unsigned long fault_ip)
> > +{
> > +       struct bpf_stream_stage ss;
> > +       struct bpf_prog *prog;
> > +
> > +       rcu_read_lock();
> > +       prog = bpf_prog_ksym_find(fault_ip);
> > +       rcu_read_unlock();
> > +       if (!prog)
> > +               return;
> > +
> > +       bpf_stream_stage(ss, prog, BPF_STDERR, ({
> > +               bpf_stream_printk(ss, "ERROR: Probe %s access faule, insn=0x%lx\n",
> > +                                 write ? "WRITE" : "READ", fault_ip);
> > +               bpf_stream_dump_stack(ss);
> > +       }));
> 
> Interesting idea, but the above message is not helpful.
> Users cannot decipher a fault_ip within a bpf prog.
> It's just a random number.

Yeah, I have noticed this too. What useful is the
bpf_stream_dump_stack(), which will print the code
line that trigger the fault.

> But stepping back... just faults are common in tracing.
> If we start printing them we will just fill the stream to the max,
> but users won't know that the message is there, since no one

You are right, we definitely can't output this message
to STDERR directly. We can add an extra flag for it, as you
said below.

Or, maybe we can introduce a enum stream_type, and
the users can subscribe what kind of messages they
want to receive.

> expects it. arena and lock errors are rare and arena faults
> were specifically requested by folks who develop progs that use arena.
> This one is different. These faults have been around for a long time
> and I don't recall people asking for more verbosity.
> We can add them with an extra flag specified at prog load time,
> but even then. Doesn't feel that useful.

Generally speaking, users can do invalid checking before
they do the memory reading, such as NULL checking. And
the pointer in function arguments that we hook is initialized
in most case. So the fault is someting that can be prevented.

I have a BPF tools which is writed for 4.X kernel and kprobe
based BPF is used. Now I'm planing to migrate it to 6.X kernel
and replace bpf_probe_read_kernel() with bpf_core_cast() to
obtain better performance. Then I find that I can't check if the
memory reading is success, which can lead to potential risk.
So my tool will be happy to get such fault event :)

Leon suggested to add a global errno for each BPF programs,
and I haven't dig deeply on this idea yet.

Thanks!
Menglong Dong

> 
>