[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <29c58126-3146-4c61-8166-a894c0e84d08@efficios.com>
Date: Tue, 22 Oct 2024 11:19:01 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>, Jordan Rife <jrife@...gle.com>
Cc: syzbot+b390c8062d8387b6272a@...kaller.appspotmail.com, andrii@...nel.org,
ast@...nel.org, bpf@...r.kernel.org, daniel@...earbox.net,
eddyz87@...il.com, haoluo@...gle.com, john.fastabend@...il.com,
jolsa@...nel.org, kpsingh@...nel.org, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, martin.lau@...ux.dev,
mattbobrowski@...gle.com, mhiramat@...nel.org, sdf@...ichev.me,
song@...nel.org, syzkaller-bugs@...glegroups.com, yonghong.song@...ux.dev
Subject: Re: [syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in
bpf_trace_run2 (2)
On 2024-10-22 04:20, Steven Rostedt wrote:
>
> Mathieu, can you look at this?
Sure,
>
> [ more below ]
>
> On Mon, 21 Oct 2024 18:23:47 +0000
> Jordan Rife <jrife@...gle.com> wrote:
>
>> I performed a bisection and this issue starts with commit a363d27cdbc2
>> ("tracing: Allow system call tracepoints to handle page faults") which
>> introduces this change.
>>
>>> + *
>>> + * With @syscall=0, the tracepoint callback array dereference is
>>> + * protected by disabling preemption.
>>> + * With @syscall=1, the tracepoint callback array dereference is
>>> + * protected by Tasks Trace RCU, which allows probes to handle page
>>> + * faults.
>>> */
>>> #define __DO_TRACE(name, args, cond, syscall) \
>>> do { \
>>> @@ -204,11 +212,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>>> if (!(cond)) \
>>> return; \
>>> \
>>> - preempt_disable_notrace(); \
>>> + if (syscall) \
>>> + rcu_read_lock_trace(); \
>>> + else \
>>> + preempt_disable_notrace(); \
>>> \
>>> __DO_TRACE_CALL(name, TP_ARGS(args)); \
>>> \
>>> - preempt_enable_notrace(); \
>>> + if (syscall) \
>>> + rcu_read_unlock_trace(); \
>>> + else \
>>> + preempt_enable_notrace(); \
>>> } while (0)
>>
>> Link: https://lore.kernel.org/bpf/20241009010718.2050182-6-mathieu.desnoyers@efficios.com/
>>
>> I reproduced the bug locally by running syz-execprog inside a QEMU VM.
>>
>>> ./syz-execprog -repeat=0 -procs=5 ./repro.syz.txt
>>
>> I /think/ what is happening is that with this change preemption may now
>> occur leading to a scenario where the RCU grace period is insufficient
>> in a few places where call_rcu() is used. In other words, there are a
>> few scenarios where call_rcu_tasks_trace() should be used instead to
>> prevent a use-after-free bug when a preempted tracepoint call tries to
>> access a program, link, etc. that was freed. It seems the syzkaller
>> program induces page faults while attaching raw tracepoints to
>> sys_enter making preemption more likely to occur.
>>
>> kernel/tracepoint.c
>> ===================
>>> ...
>>> static inline void release_probes(struct tracepoint_func *old)
>>> {
>>> ...
>>> call_rcu(&tp_probes->rcu, rcu_free_old_probes); <-- Here
>
> Have you tried just changing this one to call_rcu_tasks_trace()?
Actually, I see two possible solutions there:
1) If we want to keep unchanged register/unregister functions as a single
API for normal and syscall tracepoints, and we don't want to add additional
flags in the tracepoint struct, then we need to chain the call_rcu:
call_rcu() -> rcu_free_old_probes -> call_rcu_tasks_trace() -> rcu_tasks_trace_free_old_probes.
Just like we did when we used SRCU.
This is not perfect because we'd be adding extra delay for reclaim of
the old probes array for every tracepoint being updated, not just the
syscall tracepoints, but we probably don't care. This is a straightforward
initial solution.
We did something similar with SRCU before and it was OK.
2) If we want something more elegant, we can add a flag to struct tracepoint
for syscall tracepoints, and use that flag to choose between call_rcu()
and call_rcu_tasks_trace(). But maybe this additional complexity is not
even useful.
I'll prepare a patch implementing (1) and send it your way as RFC for further
testing.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists