Message-ID: <fae16781-3b08-4315-b916-bee2bbc0c495@paulmck-laptop>
Date: Wed, 23 Oct 2024 07:13:29 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
	Alexander Potapenko <glider@...gle.com>,
	syzbot <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com>,
	audit@...r.kernel.org, eparis@...hat.com,
	linux-kernel@...r.kernel.org, paul@...l-moore.com,
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities

On Tue, Oct 22, 2024 at 07:31:28AM -0700, Paul E. McKenney wrote:
> On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote:
> > On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote:
> > > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote:
> > > > On Fri, Sep 27, 2024 at 4:57 PM syzbot
> > > > <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > > >
> > > > > HEAD commit:    075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
> > > > > git tree:       upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
> > > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > > >
> > > > > Downloadable assets:
> > > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
> > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
> > > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
> > > > >
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com
> > [...]
> > > > +PeterZ, who added the KCSAN assertion.
> > > 
> > > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for
> > > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split
> > > DEQUEUE_SLEEP from deactivate_task()").
> > > 
> > > I'm not at all sure I have any inkling as to what the annotation does
> > > nor what KCSAN is trying to tell us above.
> > 
> > ASSERT_EXCLUSIVE_WRITER(var) is to say that there should be no
> > concurrent writes to var; other readers are allowed. If KCSAN is
> > enabled, it then goes and reports any violations of that assertion.
> > Main use case is for already marked accesses where concurrent accesses
> > are _not_ data races, but the algorithm does not assume concurrent
> > writers regardless.
> > 
> > In this case it seems that Paul was trying to say that there should be
> > no concurrent writers to this variable. But KCSAN disproved that.
> 
> Just confirming that this was my intent.

Except that it looks to me that Peter added this one, not me.  ;-)

> And for all I know, maybe it is now OK to have concurrent writers to
> that variable, but if so, would we please have an explanatory comment
> (or a reference to one)?

							Thanx, Paul

> > > Can someone please translate?
> > 
> > We can get the 2nd stack trace with:
> > 
> > 	--- a/kernel/sched/Makefile
> > 	+++ b/kernel/sched/Makefile
> > 	@@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n
> > 
> > 	 # Disable KCSAN to avoid excessive noise and performance degradation. To avoid
> > 	 # false positives ensure barriers implied by sched functions are instrumented.
> > 	-KCSAN_SANITIZE := n
> > 	-KCSAN_INSTRUMENT_BARRIERS := y
> > 	+#KCSAN_SANITIZE := n
> > 	+#KCSAN_INSTRUMENT_BARRIERS := y
> > 
> > Which gives us:
> > 
> >  | ==================================================================
> >  | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate
> >  | 
> >  | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0:
> >  |  activate_task kernel/sched/core.c:2064 [inline]
> > 
> > This is this one:
> > 
> > 	void activate_task(struct rq *rq, struct task_struct *p, int flags)
> > 	{
> > 		if (task_on_rq_migrating(p))
> > 			flags |= ENQUEUE_MIGRATED;
> > 		if (flags & ENQUEUE_MIGRATED)
> > 			sched_mm_cid_migrate_to(rq, p);
> > 
> > 		enqueue_task(rq, p, flags);
> > 
> > 		WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
> > 		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> > 	}
> > 
> >  |  ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671
> >  |  ttwu_queue kernel/sched/core.c:3944 [inline]
> >  |  try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270
> >  |  default_wake_function+0x25/0x30 kernel/sched/core.c:7009
> >  |  __pollwake fs/select.c:205 [inline]
> >  |  pollwake+0xc0/0x100 fs/select.c:215
> >  |  __wake_up_common kernel/sched/wait.c:89 [inline]
> >  |  __wake_up_common_lock kernel/sched/wait.c:106 [inline]
> >  |  __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173
> >  |  sock_def_readable+0x6f/0x180 net/core/sock.c:3442
> >  |  tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193
> >  |  tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283
> >  |  tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237
> >  |  tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915
> >  |  tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350
> >  |  ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205
> >  |  ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233
> >  |  NF_HOOK include/linux/netfilter.h:314 [inline]
> >  |  ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254
> >  |  dst_input include/net/dst.h:460 [inline]
> >  |  ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
> >  |  ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline]
> >  |  ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638
> >  |  ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672
> >  |  __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline]
> >  |  __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756
> >  |  __netif_receive_skb_list net/core/dev.c:5808 [inline]
> >  |  netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899
> >  |  gro_normal_list include/net/gro.h:515 [inline]
> >  |  napi_complete_done+0x161/0x3a0 net/core/dev.c:6250
> >  |  e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
> >  |  __napi_poll+0x66/0x360 net/core/dev.c:6775
> >  |  napi_poll net/core/dev.c:6844 [inline]
> >  |  net_rx_action+0x3d9/0x820 net/core/dev.c:6966
> >  |  handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554
> >  |  __do_softirq kernel/softirq.c:588 [inline]
> >  |  invoke_softirq kernel/softirq.c:428 [inline]
> >  |  __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637
> >  |  common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278
> >  |  asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
> >  | 
> >  | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3:
> >  |  __block_task kernel/sched/sched.h:2770 [inline]
> > 
> > And that's:
> > 
> > 	static inline void __block_task(struct rq *rq, struct task_struct *p)
> > 	{
> > 		WRITE_ONCE(p->on_rq, 0);
> > 		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> > 		if (p->sched_contributes_to_load)
> > 			rq->nr_uninterruptible++;
> > 
> >  |  dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177
> >  |  pick_next_entity kernel/sched/fair.c:5627 [inline]
> >  |  pick_task_fair kernel/sched/fair.c:8856 [inline]
> >  |  pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876
> >  |  __pick_next_task kernel/sched/core.c:5955 [inline]
> >  |  pick_next_task kernel/sched/core.c:6477 [inline]
> >  |  __schedule+0x47a/0x1130 kernel/sched/core.c:6629
> >  |  __schedule_loop kernel/sched/core.c:6752 [inline]
> >  |  schedule+0x7b/0x130 kernel/sched/core.c:6767
> >  |  do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032
> >  |  hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080
> >  |  common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365
> >  |  __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline]
> >  |  __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388
> >  |  __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388
> >  |  x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231
> >  |  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >  |  do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83
> >  |  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >  | 
> >  | Reported by Kernel Concurrency Sanitizer on:
> >  | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7
> >  | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> >  | ==================================================================
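
For anyone trying to picture what the assertion checks, here is a
minimal userspace sketch in C11. It is only a model, not how KCSAN is
implemented: every name in it (watched_int, watched_write,
assert_window_open, MODEL_ON_RQ_QUEUED and so on) is hypothetical, and
the two contexts from the report are replayed sequentially in a single
thread rather than run concurrently. The idea it illustrates is the one
Marco describes above: marked reads during the assertion window are
fine, while a marked write from another context is counted as a
violation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of an exclusive-writer assertion.
 * None of these names exist in the kernel; KCSAN's real watchpoint
 * machinery is considerably more involved.
 */
#define MODEL_ON_RQ_QUEUED 1	/* stand-in for TASK_ON_RQ_QUEUED */

struct watched_int {
	atomic_int  value;	/* stands in for p->on_rq */
	atomic_bool armed;	/* true while an assertion window is open */
	atomic_int  violations;	/* writes observed while the window was open */
};

static inline void watched_init(struct watched_int *w, int v)
{
	atomic_init(&w->value, v);
	atomic_init(&w->armed, false);
	atomic_init(&w->violations, 0);
}

/* Marked write (think WRITE_ONCE()): not a data race, but still counted
 * if it lands while another context asserts that it is the only writer. */
static inline void watched_write(struct watched_int *w, int v)
{
	if (atomic_load(&w->armed))
		atomic_fetch_add(&w->violations, 1);
	atomic_store(&w->value, v);
}

/* Marked read (think READ_ONCE()): always allowed, never counted. */
static inline int watched_read(struct watched_int *w)
{
	return atomic_load(&w->value);
}

/* Open the window: "from now on, nobody else should write". */
static inline void assert_window_open(struct watched_int *w)
{
	atomic_store(&w->violations, 0);
	atomic_store(&w->armed, true);
}

/* Close the window and return how many foreign writes were seen. */
static inline int assert_window_close(struct watched_int *w)
{
	atomic_store(&w->armed, false);
	return atomic_load(&w->violations);
}

/* Replay the reported interleaving in one thread: the blocking path
 * writes 0 and asserts, then the wakeup path writes the queued value. */
static inline int replay_reported_interleaving(void)
{
	struct watched_int on_rq;

	watched_init(&on_rq, MODEL_ON_RQ_QUEUED);
	watched_write(&on_rq, 0);			/* __block_task() side */
	assert_window_open(&on_rq);			/* its exclusive-writer assertion */
	(void)watched_read(&on_rq);			/* concurrent readers are fine */
	watched_write(&on_rq, MODEL_ON_RQ_QUEUED);	/* activate_task() side */
	return assert_window_close(&on_rq);
}
```

In this model replay_reported_interleaving() returns a nonzero count
because the second write lands inside the window, which is the
situation the report above flags. A window that sees only
watched_read() calls closes with a count of zero.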
