Message-ID: <62aaa141-e41d-4e7e-95eb-c48e4f7fc558@paulmck-laptop>
Date: Tue, 22 Oct 2024 07:31:28 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
	Alexander Potapenko <glider@...gle.com>,
	syzbot <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com>,
	audit@...r.kernel.org, eparis@...hat.com,
	linux-kernel@...r.kernel.org, paul@...l-moore.com,
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities

On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote:
> On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote:
> > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote:
> > > On Fri, Sep 27, 2024 at 4:57 PM syzbot
> > > <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com
> [...]
> > > +PeterZ, who added the KCSAN assertion.
> > 
> > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for
> > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split
> > DEQUEUE_SLEEP from deactivate_task()").
> > 
> > I'm not at all sure I have any inkling as to what the annotation does
> > nor what KCSAN is trying to tell us above.
> 
> ASSERT_EXCLUSIVE_WRITER(var) is to say that there should be no
> concurrent writes to var; other readers are allowed. If KCSAN is
> enabled, it then goes and reports any violations of that assertion.
> The main use case is for already marked accesses where concurrent
> accesses are _not_ data races, but the algorithm nonetheless assumes
> there are no concurrent writers.
> 
> In this case it seems that Paul was trying to say that there should be
> no concurrent writers to this variable. But KCSAN disproved that.

Just confirming that this was my intent.

And for all I know, maybe it is now OK to have concurrent writers to
that variable, but if so, would we please have an explanatory comment
(or a reference to one)?

							Thanx, Paul

> > Can someone please translate?
> 
> We can get the 2nd stack trace with:
> 
> 	--- a/kernel/sched/Makefile
> 	+++ b/kernel/sched/Makefile
> 	@@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n
> 
> 	 # Disable KCSAN to avoid excessive noise and performance degradation. To avoid
> 	 # false positives ensure barriers implied by sched functions are instrumented.
> 	-KCSAN_SANITIZE := n
> 	-KCSAN_INSTRUMENT_BARRIERS := y
> 	+#KCSAN_SANITIZE := n
> 	+#KCSAN_INSTRUMENT_BARRIERS := y
> 
> Which gives us:
> 
>  | ==================================================================
>  | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate
>  | 
>  | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0:
>  |  activate_task kernel/sched/core.c:2064 [inline]
> 
> This is this one:
> 
> 	void activate_task(struct rq *rq, struct task_struct *p, int flags)
> 	{
> 		if (task_on_rq_migrating(p))
> 			flags |= ENQUEUE_MIGRATED;
> 		if (flags & ENQUEUE_MIGRATED)
> 			sched_mm_cid_migrate_to(rq, p);
> 
> 		enqueue_task(rq, p, flags);
> 
> 		WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
> 		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> 	}
> 
>  |  ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671
>  |  ttwu_queue kernel/sched/core.c:3944 [inline]
>  |  try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270
>  |  default_wake_function+0x25/0x30 kernel/sched/core.c:7009
>  |  __pollwake fs/select.c:205 [inline]
>  |  pollwake+0xc0/0x100 fs/select.c:215
>  |  __wake_up_common kernel/sched/wait.c:89 [inline]
>  |  __wake_up_common_lock kernel/sched/wait.c:106 [inline]
>  |  __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173
>  |  sock_def_readable+0x6f/0x180 net/core/sock.c:3442
>  |  tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193
>  |  tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283
>  |  tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237
>  |  tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915
>  |  tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350
>  |  ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205
>  |  ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233
>  |  NF_HOOK include/linux/netfilter.h:314 [inline]
>  |  ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254
>  |  dst_input include/net/dst.h:460 [inline]
>  |  ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
>  |  ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline]
>  |  ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638
>  |  ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672
>  |  __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline]
>  |  __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756
>  |  __netif_receive_skb_list net/core/dev.c:5808 [inline]
>  |  netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899
>  |  gro_normal_list include/net/gro.h:515 [inline]
>  |  napi_complete_done+0x161/0x3a0 net/core/dev.c:6250
>  |  e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
>  |  __napi_poll+0x66/0x360 net/core/dev.c:6775
>  |  napi_poll net/core/dev.c:6844 [inline]
>  |  net_rx_action+0x3d9/0x820 net/core/dev.c:6966
>  |  handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554
>  |  __do_softirq kernel/softirq.c:588 [inline]
>  |  invoke_softirq kernel/softirq.c:428 [inline]
>  |  __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637
>  |  common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278
>  |  asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
>  | 
>  | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3:
>  |  __block_task kernel/sched/sched.h:2770 [inline]
> 
> And that's:
> 
> 	static inline void __block_task(struct rq *rq, struct task_struct *p)
> 	{
> 		WRITE_ONCE(p->on_rq, 0);
> 		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> 		if (p->sched_contributes_to_load)
> 			rq->nr_uninterruptible++;
> 
>  |  dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177
>  |  pick_next_entity kernel/sched/fair.c:5627 [inline]
>  |  pick_task_fair kernel/sched/fair.c:8856 [inline]
>  |  pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876
>  |  __pick_next_task kernel/sched/core.c:5955 [inline]
>  |  pick_next_task kernel/sched/core.c:6477 [inline]
>  |  __schedule+0x47a/0x1130 kernel/sched/core.c:6629
>  |  __schedule_loop kernel/sched/core.c:6752 [inline]
>  |  schedule+0x7b/0x130 kernel/sched/core.c:6767
>  |  do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032
>  |  hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080
>  |  common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365
>  |  __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline]
>  |  __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388
>  |  __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388
>  |  x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231
>  |  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  |  do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83
>  |  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  | 
>  | Reported by Kernel Concurrency Sanitizer on:
>  | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7
>  | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>  | ==================================================================
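
To make the report above easier to read, here is a toy single-threaded model of
what ASSERT_EXCLUSIVE_WRITER() checks, as Marco describes it: while the
asserting side is being watched, reads by other contexts are fine, but a write
by another context is a violation. This is only a sketch. struct watched_int,
assert_window_open(), assert_window_close(), other_write(), other_read() and
replay_report() are hypothetical names that do not exist in the kernel, and
the model does not attempt to reproduce how KCSAN actually detects the
overlap across CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model; none of these names exist in the kernel.
 * "watching" stands in for the period during which KCSAN checks the
 * assertion made at the ASSERT_EXCLUSIVE_WRITER() site.
 */
struct watched_int {
	int value;       /* stands in for p->on_rq */
	bool watching;   /* true while the asserting side's window is open */
	int violations;  /* writes by other contexts seen inside the window */
};

/* The asserting side opens a window in which it expects no other writer. */
static void assert_window_open(struct watched_int *w)
{
	w->watching = true;
}

static void assert_window_close(struct watched_int *w)
{
	w->watching = false;
}

/* A write by another context: flagged if it lands inside the window. */
static void other_write(struct watched_int *w, int v)
{
	if (w->watching)
		w->violations++;
	w->value = v;
}

/* A read by another context: never flagged, since readers are allowed. */
static int other_read(const struct watched_int *w)
{
	return w->value;
}

/* Replays the reported interleaving in one thread; returns the violation count. */
static int replay_report(void)
{
	struct watched_int on_rq = { .value = 1, .watching = false, .violations = 0 };

	on_rq.value = 0;             /* cpu 3, __block_task(): write of 0 */
	assert_window_open(&on_rq);  /* cpu 3: the assertion is now being checked */
	(void)other_read(&on_rq);    /* a concurrent reader: not a violation */
	other_write(&on_rq, 1);      /* cpu 0, activate_task(): concurrent write */
	assert_window_close(&on_rq);

	return on_rq.violations;
}
```

Calling replay_report() from a small main() counts one violation, which
corresponds to the "write (marked) ... by interrupt on cpu 0" half of the
report landing while the "assert no writes ... on cpu 3" half is active.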
