Message-ID: <CADKFtnTUmRe1T92BQ_NB=V7DW13hAKpA40rm+m6DkpTNf5RyFw@mail.gmail.com>
Date: Wed, 30 Oct 2024 10:34:40 -0700
From: Jordan Rife <jrife@...gle.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Andrii Nakryiko <andrii.nakryiko@...il.com>, Alexei Starovoitov <ast@...nel.org>, bpf <bpf@...r.kernel.org>,
Joel Fernandes <joel@...lfernandes.org>, LKML <linux-kernel@...r.kernel.org>,
Mark Rutland <mark.rutland@....com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Masami Hiramatsu <mhiramat@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Michael Jeanson <mjeanson@...icios.com>, Namhyung Kim <namhyung@...nel.org>,
"Paul E. McKenney" <paulmck@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>, Thomas Gleixner <tglx@...utronix.de>, Yonghong Song <yhs@...com>
Subject: Re: [RFC PATCH v4 4/4] tracing: Add might_fault() check in
__DO_TRACE() for syscall
> > [ 687.334265][T16276] allocated by task 16281 on cpu 1 at 683.953385s (3.380878s ago):
> > [ 687.335615][T16276] tracepoint_add_func+0x28a/0xd90
> > [ 687.336424][T16276] tracepoint_probe_register_prio_may_exist+0xa2/0xf0
> > [ 687.337416][T16276] bpf_probe_register+0x186/0x200
> > [ 687.338174][T16276] bpf_raw_tp_link_attach+0x21f/0x540
> > [ 687.339233][T16276] __sys_bpf+0x393/0x4fa0
> > [ 687.340042][T16276] __x64_sys_bpf+0x78/0xc0
> > [ 687.340801][T16276] do_syscall_64+0xcb/0x250
> > [ 687.341623][T16276] entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> I think the stack trace points out that the patch [1] isn't really fixing it.
> UAF is on access to bpf_link in __traceiter_sys_enter
The stack trace points to the memory in question being allocated in
tracepoint_add_func(), where the allocation and the assignment to
__tracepoint_sys_enter->funcs happen. Mathieu's patch addresses
use-after-free on this structure by using call_rcu_tasks_trace() inside
release_probes(). In contrast, here is what the "allocated by" trace
looks like for the UAF on access to bpf_link (copied from the original
KASAN crash report [4]):
> Allocated by task 5681:
> kasan_save_stack mm/kasan/common.c:47 [inline]
> kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
> __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
> kasan_kmalloc include/linux/kasan.h:260 [inline]
> __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4304
> kmalloc_noprof include/linux/slab.h:901 [inline]
> kzalloc_noprof include/linux/slab.h:1037 [inline]
> bpf_raw_tp_link_attach+0x2a0/0x6e0 kernel/bpf/syscall.c:3829
> bpf_raw_tracepoint_open+0x177/0x1f0 kernel/bpf/syscall.c:3876
> __sys_bpf+0x3c0/0x810 kernel/bpf/syscall.c:5691
> __do_sys_bpf kernel/bpf/syscall.c:5756 [inline]
> __se_sys_bpf kernel/bpf/syscall.c:5754 [inline]
> __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5754
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
This clearly points to where memory for a bpf_link is allocated.
> link = kzalloc(sizeof(*link), GFP_USER);
> if (!link) {
> err = -ENOMEM;
> goto out_put_btp;
> }
To add some context: if I apply Mathieu's patch alone, I get no
meaningful test signal when running the reproducer, because a UAF
crash almost always occurs first on accesses to bpf_link or bpf_prog,
showing a trace like the second one above. My intent in applying patch
[1] is to mask out these BPF-side sources of UAF crashes so I can
focus on what this series addresses. This series should eventually be
tested end-to-end with Andrii's fix for the BPF side that he mentioned
last week, but that fix would rely on this patch series,
tracepoint_is_faultable() in particular, so it's a bit of a
chicken-and-egg problem in terms of testing. In the meantime, [1]
provides a bandaid that allows some degree of test coverage for this
patch.
> while your patch [1] and all attempts to "fix" were delaying bpf_prog.
> The issue is not reproducing anymore due to luck.
[1] chains call_rcu_tasks_trace() and call_rcu() to free both bpf_prog
and bpf_link after unregistering the tracepoint. This grace period
should be sufficient to prevent UAF on these structures from the
syscall tracepoint handlers, which are protected with
rcu_read_lock_trace(). I've run the reproducer many times; without
/some/ fix on the BPF side it crashes reliably within seconds here.
Using call_rcu_tasks_trace(), or chaining it with call_rcu(),
eliminates the UAF on the BPF structures, which removes a couple of
variables from local testing.
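For reference, the chaining works roughly like this (a non-compiling
sketch of the pattern, not the actual patch; the callback and struct
names here are illustrative, not the ones used in [1]):

	/* Illustrative sketch of chaining deferred frees. Readers access
	 * the link under rcu_read_lock_trace(), so the object must survive
	 * both a tasks-trace and a regular RCU grace period before it is
	 * actually freed.
	 */
	static void link_free_rcu(struct rcu_head *rcu)
	{
		struct bpf_raw_tp_link *link =
			container_of(rcu, struct bpf_raw_tp_link, rcu);

		kfree(link);	/* both grace periods have elapsed */
	}

	static void link_free_tasks_trace(struct rcu_head *rcu)
	{
		/* Tasks-trace grace period done; now wait for vanilla RCU. */
		call_rcu(rcu, link_free_rcu);
	}

	static void link_release(struct bpf_raw_tp_link *link)
	{
		bpf_probe_unregister(link->btp, link);
		/* First defer past a tasks-trace RCU grace period. */
		call_rcu_tasks_trace(&link->rcu, link_free_tasks_trace);
	}

The point is only the ordering: unregister first, then chain the two
grace periods before kfree(), so sleepable handlers still inside a
tasks-trace read-side section never see freed memory.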
If you are not convinced, I'm happy to run through other test
scenarios or run the reproducer for much longer.
-Jordan