netdev - Re: [BUG] possible deadlock in __schedule (with reproducer available)

Message-ID: <CAADnVQLBhV_sSuH+BKu66ZsxTcsvw7RSLnjRGLwQX3TFSjs-Gg@mail.gmail.com>
Date: Sun, 24 Nov 2024 18:02:35 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ruan Bonan <bonan.ruan@...us.edu>, 
	"mingo@...hat.com" <mingo@...hat.com>, "will@...nel.org" <will@...nel.org>, 
	"longman@...hat.com" <longman@...hat.com>, "boqun.feng@...il.com" <boqun.feng@...il.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "kpsingh@...nel.org" <kpsingh@...nel.org>, 
	"mattbobrowski@...gle.com" <mattbobrowski@...gle.com>, "ast@...nel.org" <ast@...nel.org>, 
	"daniel@...earbox.net" <daniel@...earbox.net>, "andrii@...nel.org" <andrii@...nel.org>, 
	"martin.lau@...ux.dev" <martin.lau@...ux.dev>, "eddyz87@...il.com" <eddyz87@...il.com>, 
	"song@...nel.org" <song@...nel.org>, "yonghong.song@...ux.dev" <yonghong.song@...ux.dev>, 
	"john.fastabend@...il.com" <john.fastabend@...il.com>, "sdf@...ichev.me" <sdf@...ichev.me>, 
	"haoluo@...gle.com" <haoluo@...gle.com>, "jolsa@...nel.org" <jolsa@...nel.org>, 
	"mhiramat@...nel.org" <mhiramat@...nel.org>, 
	"mathieu.desnoyers@...icios.com" <mathieu.desnoyers@...icios.com>, 
	"bpf@...r.kernel.org" <bpf@...r.kernel.org>, 
	"linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>, 
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, Fu Yeqi <e1374359@...us.edu>
Subject: Re: [BUG] possible deadlock in __schedule (with reproducer available)

On Sat, Nov 23, 2024 at 2:59 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Sat, 23 Nov 2024 21:27:44 +0100
> Peter Zijlstra <peterz@...radead.org> wrote:
>
> > On Sat, Nov 23, 2024 at 03:39:45AM +0000, Ruan Bonan wrote:
> >
> > >  </TASK>
> > > FAULT_INJECTION: forcing a failure.
> > > name fail_usercopy, interval 1, probability 0, space 0, times 0
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 6.12.0-rc7-00144-g66418447d27b #8 Not tainted
> > > ------------------------------------------------------
> > > syz-executor144/330 is trying to acquire lock:
> > > ffffffffbcd2da38 ((console_sem).lock){....}-{2:2}, at: down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
> > >
> > > but task is already holding lock:
> > > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:598 [inline]
> > > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1506 [inline]
> > > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1805 [inline]
> > > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x140/0x1e70 kernel/sched/core.c:6592
> > >
> > > which lock already depends on the new lock.
> > >
> > >        _printk+0x7a/0xa0 kernel/printk/printk.c:2432
> > >        fail_dump lib/fault-inject.c:46 [inline]
> > >        should_fail_ex+0x3be/0x570 lib/fault-inject.c:154
> > >        strncpy_from_user+0x36/0x230 lib/strncpy_from_user.c:118
> > >        strncpy_from_user_nofault+0x71/0x140 mm/maccess.c:186
> > >        bpf_probe_read_user_str_common kernel/trace/bpf_trace.c:215 [inline]
> > >        ____bpf_probe_read_user_str kernel/trace/bpf_trace.c:224 [inline]
> > >        bpf_probe_read_user_str+0x2a/0x70 kernel/trace/bpf_trace.c:221
> > >        bpf_prog_bc7c5c6b9645592f+0x3e/0x40
> > >        bpf_dispatcher_nop_func include/linux/bpf.h:1265 [inline]
> > >        __bpf_prog_run include/linux/filter.h:701 [inline]
> > >        bpf_prog_run include/linux/filter.h:708 [inline]
> > >        __bpf_trace_run kernel/trace/bpf_trace.c:2316 [inline]
> > >        bpf_trace_run4+0x30b/0x4d0 kernel/trace/bpf_trace.c:2359
> > >        __bpf_trace_sched_switch+0x1c6/0x2c0 include/trace/events/sched.h:222
> > >        trace_sched_switch+0x12a/0x190 include/trace/events/sched.h:222
> >
> > -EWONTFIX. Don't do stupid.
>
> Ack. BPF should not be causing deadlocks by doing code called from
> tracepoints.

I sense so much BPF love here that it diminishes the ability to read
stack traces :)
The above is one of many printk-related splats that syzbot keeps finding.
This is not a new issue, and it has nothing to do with BPF.

> Tracepoints have a special context similar to NMIs. If you add
> a hook into an NMI handler that causes a deadlock, it's a bug in the hook,
> not the NMI code. If you add code that causes a deadlock when attaching to a
> tracepoint, it's a bug in the hook, not the tracepoint.

Trace events call strncpy_from_user_nofault() as well:
kernel/trace/trace_events_filter.c:830