[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62aaa141-e41d-4e7e-95eb-c48e4f7fc558@paulmck-laptop>
Date: Tue, 22 Oct 2024 07:31:28 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Alexander Potapenko <glider@...gle.com>,
syzbot <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com>,
audit@...r.kernel.org, eparis@...hat.com,
linux-kernel@...r.kernel.org, paul@...l-moore.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities
On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote:
> On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote:
> > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote:
> > > On Fri, Sep 27, 2024 at 4:57 PM syzbot
> > > <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
> > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com
> [...]
> > > +PeterZ, who added the KCSAN assertion.
> >
> > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for
> > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split
> > DEQUEUE_SLEEP from deactivate_task()").
> >
> > I'm not at all sure I have any inkling as to what the annotation does
> > nor what KCSAN is trying to tell us above.
>
> ASSERT_EXCLUSIVE_WRITER(var) is to say that that there should be no
> concurrent writes to var; other readers are allowed. If KCSAN is
> enabled, it then goes and reports any violations of that assertion.
> Main usecase is for already marked accesses where concurrent accesses
> are _not_ data races, but the algorithm does not assume concurrent
> writers regardless.
>
> In this case it seems that Paul was trying to say that there should be
> no concurrent writers to this variable. But KCSAN disproved that.
Just confirming that this was my intent.
And for all I know, maybe it is now OK to have concurrent writers to
that variable, but if so, would we please have an explanatory comment
(or a reference to one)?
Thanx, Paul
> > Can someone please translate?
>
> We can get the 2nd stack trace with:
>
> --- a/kernel/sched/Makefile
> +++ b/kernel/sched/Makefile
> @@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n
>
> # Disable KCSAN to avoid excessive noise and performance degradation. To avoid
> # false positives ensure barriers implied by sched functions are instrumented.
> -KCSAN_SANITIZE := n
> -KCSAN_INSTRUMENT_BARRIERS := y
> +#KCSAN_SANITIZE := n
> +#KCSAN_INSTRUMENT_BARRIERS := y
>
> Which gives us:
>
> | ==================================================================
> | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate
> |
> | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0:
> | activate_task kernel/sched/core.c:2064 [inline]
>
> This is this one:
>
> void activate_task(struct rq *rq, struct task_struct *p, int flags)
> {
> if (task_on_rq_migrating(p))
> flags |= ENQUEUE_MIGRATED;
> if (flags & ENQUEUE_MIGRATED)
> sched_mm_cid_migrate_to(rq, p);
>
> enqueue_task(rq, p, flags);
>
> WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
> ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> }
>
> | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671
> | ttwu_queue kernel/sched/core.c:3944 [inline]
> | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270
> | default_wake_function+0x25/0x30 kernel/sched/core.c:7009
> | __pollwake fs/select.c:205 [inline]
> | pollwake+0xc0/0x100 fs/select.c:215
> | __wake_up_common kernel/sched/wait.c:89 [inline]
> | __wake_up_common_lock kernel/sched/wait.c:106 [inline]
> | __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173
> | sock_def_readable+0x6f/0x180 net/core/sock.c:3442
> | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193
> | tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283
> | tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237
> | tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915
> | tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350
> | ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205
> | ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233
> | NF_HOOK include/linux/netfilter.h:314 [inline]
> | ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254
> | dst_input include/net/dst.h:460 [inline]
> | ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
> | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline]
> | ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638
> | ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672
> | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline]
> | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756
> | __netif_receive_skb_list net/core/dev.c:5808 [inline]
> | netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899
> | gro_normal_list include/net/gro.h:515 [inline]
> | napi_complete_done+0x161/0x3a0 net/core/dev.c:6250
> | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
> | __napi_poll+0x66/0x360 net/core/dev.c:6775
> | napi_poll net/core/dev.c:6844 [inline]
> | net_rx_action+0x3d9/0x820 net/core/dev.c:6966
> | handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554
> | __do_softirq kernel/softirq.c:588 [inline]
> | invoke_softirq kernel/softirq.c:428 [inline]
> | __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637
> | common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278
> | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
> |
> | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3:
> | __block_task kernel/sched/sched.h:2770 [inline]
>
> And that's:
>
> static inline void __block_task(struct rq *rq, struct task_struct *p)
> {
> WRITE_ONCE(p->on_rq, 0);
> ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> if (p->sched_contributes_to_load)
> rq->nr_uninterruptible++;
>
> | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177
> | pick_next_entity kernel/sched/fair.c:5627 [inline]
> | pick_task_fair kernel/sched/fair.c:8856 [inline]
> | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876
> | __pick_next_task kernel/sched/core.c:5955 [inline]
> | pick_next_task kernel/sched/core.c:6477 [inline]
> | __schedule+0x47a/0x1130 kernel/sched/core.c:6629
> | __schedule_loop kernel/sched/core.c:6752 [inline]
> | schedule+0x7b/0x130 kernel/sched/core.c:6767
> | do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032
> | hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080
> | common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365
> | __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline]
> | __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388
> | __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388
> | x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231
> | do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> | do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83
> | entry_SYSCALL_64_after_hwframe+0x77/0x7f
> |
> | Reported by Kernel Concurrency Sanitizer on:
> | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7
> | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> | ==================================================================
Powered by blists - more mailing lists