Message-ID: <ZxerZIxg8kAMCvYc@elver.google.com>
Date: Tue, 22 Oct 2024 15:40:52 +0200
From: Marco Elver <elver@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>, paulmck@...nel.org
Cc: Alexander Potapenko <glider@...gle.com>,
	syzbot <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com>,
	audit@...r.kernel.org, eparis@...hat.com,
	linux-kernel@...r.kernel.org, paul@...l-moore.com,
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities

On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote:
> > On Fri, Sep 27, 2024 at 4:57 PM syzbot
> > <syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@...kaller.appspotmail.com
[...]
> > +PeterZ, who added the KCSAN assertion.
> 
> Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for
> p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split
> DEQUEUE_SLEEP from deactivate_task()").
> 
> I'm not at all sure I have any inkling as to what the annotation does
> nor what KCSAN is trying to tell us above.

ASSERT_EXCLUSIVE_WRITER(var) says that there should be no concurrent
writes to var; concurrent readers are allowed. If KCSAN is enabled, it
reports any violations of that assertion. The main use case is for
already marked accesses, where concurrent accesses are _not_ data races,
but the algorithm nevertheless assumes that there are no concurrent
writers.

In this case it seems that Paul was trying to say that there should be
no concurrent writers to this variable. But KCSAN disproved that.
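
To make the semantics above concrete, here is a toy userspace model. It
is only a sketch: every name in it (model_watchpoint, model_write,
run_scenario, the context ids 0 and 3) is made up for illustration and
none of it is a kernel or KCSAN API. It replaces real concurrency with a
deterministic interleaving, and it assumes the checker works by watching
one location for a window of time, which is a simplification of what
KCSAN does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel or KCSAN APIs. */
struct model_watchpoint {
	const void *addr;	/* location under assertion, NULL when unarmed */
	int owner;		/* context that armed the assertion */
	int violations;		/* writes by other contexts while armed */
};

static struct model_watchpoint wp;

/* Open the window in which no other context may write to addr. */
static void model_assert_begin(const void *addr, int ctx)
{
	wp.addr = addr;
	wp.owner = ctx;
}

/* Close the window. */
static void model_assert_end(void)
{
	wp.addr = NULL;
}

/* A write from another context to the watched location is a violation. */
static void model_write(int *addr, int val, int ctx)
{
	if (wp.addr == addr && wp.owner != ctx)
		wp.violations++;
	*addr = val;
}

/* Reads are never reported: the assertion permits concurrent readers. */
static int model_read(const int *addr)
{
	return *addr;
}

/*
 * Context 3 plays the task blocking itself, context 0 the wakeup from
 * interrupt. Returns the number of violations the model recorded.
 */
int run_scenario(bool waker_writes)
{
	int on_rq = 1;

	wp.violations = 0;
	model_write(&on_rq, 0, 3);		/* like WRITE_ONCE(p->on_rq, 0) */
	model_assert_begin(&on_rq, 3);
	if (waker_writes)
		model_write(&on_rq, 1, 0);	/* like the write in activate_task() */
	else
		(void)model_read(&on_rq);	/* a reader alone is fine */
	model_assert_end();
	return wp.violations;
}
```

In this model, run_scenario(true) records one violation, because a second
context writes while the assertion window is open, and
run_scenario(false) records none, because only a read happens in the
window.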

> Can someone please translate?

We can get the 2nd stack trace with:

	--- a/kernel/sched/Makefile
	+++ b/kernel/sched/Makefile
	@@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n

	 # Disable KCSAN to avoid excessive noise and performance degradation. To avoid
	 # false positives ensure barriers implied by sched functions are instrumented.
	-KCSAN_SANITIZE := n
	-KCSAN_INSTRUMENT_BARRIERS := y
	+#KCSAN_SANITIZE := n
	+#KCSAN_INSTRUMENT_BARRIERS := y

Which gives us:

 | ==================================================================
 | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate
 | 
 | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0:
 |  activate_task kernel/sched/core.c:2064 [inline]

That write is the one in activate_task():

	void activate_task(struct rq *rq, struct task_struct *p, int flags)
	{
		if (task_on_rq_migrating(p))
			flags |= ENQUEUE_MIGRATED;
		if (flags & ENQUEUE_MIGRATED)
			sched_mm_cid_migrate_to(rq, p);

		enqueue_task(rq, p, flags);

		WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
	}

 |  ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671
 |  ttwu_queue kernel/sched/core.c:3944 [inline]
 |  try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270
 |  default_wake_function+0x25/0x30 kernel/sched/core.c:7009
 |  __pollwake fs/select.c:205 [inline]
 |  pollwake+0xc0/0x100 fs/select.c:215
 |  __wake_up_common kernel/sched/wait.c:89 [inline]
 |  __wake_up_common_lock kernel/sched/wait.c:106 [inline]
 |  __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173
 |  sock_def_readable+0x6f/0x180 net/core/sock.c:3442
 |  tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193
 |  tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283
 |  tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237
 |  tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915
 |  tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350
 |  ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205
 |  ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233
 |  NF_HOOK include/linux/netfilter.h:314 [inline]
 |  ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254
 |  dst_input include/net/dst.h:460 [inline]
 |  ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
 |  ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline]
 |  ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638
 |  ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672
 |  __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline]
 |  __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756
 |  __netif_receive_skb_list net/core/dev.c:5808 [inline]
 |  netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899
 |  gro_normal_list include/net/gro.h:515 [inline]
 |  napi_complete_done+0x161/0x3a0 net/core/dev.c:6250
 |  e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
 |  __napi_poll+0x66/0x360 net/core/dev.c:6775
 |  napi_poll net/core/dev.c:6844 [inline]
 |  net_rx_action+0x3d9/0x820 net/core/dev.c:6966
 |  handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554
 |  __do_softirq kernel/softirq.c:588 [inline]
 |  invoke_softirq kernel/softirq.c:428 [inline]
 |  __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637
 |  common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278
 |  asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
 | 
 | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3:
 |  __block_task kernel/sched/sched.h:2770 [inline]

And that's:

	static inline void __block_task(struct rq *rq, struct task_struct *p)
	{
		WRITE_ONCE(p->on_rq, 0);
		ASSERT_EXCLUSIVE_WRITER(p->on_rq);
		if (p->sched_contributes_to_load)
			rq->nr_uninterruptible++;

 |  dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177
 |  pick_next_entity kernel/sched/fair.c:5627 [inline]
 |  pick_task_fair kernel/sched/fair.c:8856 [inline]
 |  pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876
 |  __pick_next_task kernel/sched/core.c:5955 [inline]
 |  pick_next_task kernel/sched/core.c:6477 [inline]
 |  __schedule+0x47a/0x1130 kernel/sched/core.c:6629
 |  __schedule_loop kernel/sched/core.c:6752 [inline]
 |  schedule+0x7b/0x130 kernel/sched/core.c:6767
 |  do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032
 |  hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080
 |  common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365
 |  __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline]
 |  __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388
 |  __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388
 |  x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231
 |  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 |  do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83
 |  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 | 
 | Reported by Kernel Concurrency Sanitizer on:
 | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7
 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 | ==================================================================
