Message-ID: <77c8887d-e69c-4554-9c9f-c9d755c7aff5@paulmck-laptop>
Date: Fri, 3 Jan 2025 17:14:46 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: syzbot <syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com>,
	acme@...nel.org, adrian.hunter@...el.com,
	alexander.shishkin@...ux.intel.com, irogers@...gle.com,
	jolsa@...nel.org, kan.liang@...ux.intel.com,
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, mark.rutland@....com,
	mhiramat@...nel.org, mingo@...hat.com, namhyung@...nel.org,
	oleg@...hat.com, peterz@...radead.org,
	syzkaller-bugs@...glegroups.com, RCU <rcu@...r.kernel.org>
Subject: Re: [syzbot] [perf?] [trace?] KCSAN: assert: race in
 srcu_gp_start_if_needed

On Sun, Nov 24, 2024 at 11:48:46PM +0100, Marco Elver wrote:
> +Cc RCU
> 
> On Sun, 24 Nov 2024 at 23:47, syzbot
> <syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    42d9e8b7ccdd Merge tag 'powerpc-6.13-1' of git://git.kerne..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10a00778580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=3d7fd5be0e73b8b
> > dashboard link: https://syzkaller.appspot.com/bug?extid=16a19b06125a2963eaee
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/ef231513adc7/disk-42d9e8b7.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/54caaac5960b/vmlinux-42d9e8b7.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/85b5a6566143/bzImage-42d9e8b7.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: assert: race in srcu_get_delay kernel/rcu/srcutree.c:658 [inline]
> > BUG: KCSAN: assert: race in srcu_funnel_gp_start kernel/rcu/srcutree.c:1089 [inline]
> > BUG: KCSAN: assert: race in srcu_gp_start_if_needed+0x808/0x9f0 kernel/rcu/srcutree.c:1339

Hmmm...  All of those are from slow paths, so locking looks to be the
best approach.

A very lightly tested prototype patch is shown below (for which feedback
is most welcome), and thank you all for your testing efforts!

							Thanx, Paul

------------------------------------------------------------------------

commit a955c6a7168f7b204784e4ef7e4db9d017043f73
Author: Paul E. McKenney <paulmck@...nel.org>
Date:   Fri Jan 3 17:04:49 2025 -0800

    srcu: Force synchronization for srcu_get_delay()
    
    Currently, srcu_get_delay() can be called concurrently, for example,
    by a CPU that is the first to request a new grace period and the CPU
    processing the current grace period.  Although concurrent access is
    harmless, it unnecessarily expands the state space.  Additionally,
    all calls to srcu_get_delay() are from slow paths.
    
    This commit therefore protects all calls to srcu_get_delay() with
    ssp->srcu_sup->lock, which is already held on the invocation from the
    srcu_funnel_gp_start() function.  While in the area, this commit also
    adds a lockdep_assert_held() to srcu_get_delay() itself.
    
    Reported-by: syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com
    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7c7304dee6457..a60acc9cf2f32 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -648,6 +648,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 	unsigned long jbase = SRCU_INTERVAL;
 	struct srcu_usage *sup = ssp->srcu_sup;
 
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	if (srcu_gp_is_expedited(ssp))
 		jbase = 0;
 	if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) {
@@ -675,9 +676,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 void cleanup_srcu_struct(struct srcu_struct *ssp)
 {
 	int cpu;
+	unsigned long delay;
 	struct srcu_usage *sup = ssp->srcu_sup;
 
-	if (WARN_ON(!srcu_get_delay(ssp)))
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
+	delay = srcu_get_delay(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
+	if (WARN_ON(!delay))
 		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(ssp)))
 		return; /* Just leak it! */
@@ -1100,7 +1105,9 @@ static bool try_check_zero(struct srcu_struct *ssp, int idx, int trycount)
 {
 	unsigned long curdelay;
 
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	curdelay = !srcu_get_delay(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 
 	for (;;) {
 		if (srcu_readers_active_idx_check(ssp, idx))
@@ -1849,7 +1856,9 @@ static void process_srcu(struct work_struct *work)
 	ssp = sup->srcu_ssp;
 
 	srcu_advance_state(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	curdelay = srcu_get_delay(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	if (curdelay) {
 		WRITE_ONCE(sup->reschedule_count, 0);
 	} else {
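
For readers unfamiliar with the pattern the commit log describes (take the lock, snapshot the value, drop the lock, then act on the snapshot, with the callee asserting that the lock is held), here is a minimal userspace sketch. It is not kernel code and not part of the patch. Every demo_* name is hypothetical, the delay values are arbitrary, and the lock_held flag is only a crude stand-in for lockdep_assert_held(): it records that some thread holds the lock, not that the calling thread does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the lock-protected SRCU state. */
struct demo_state {
	pthread_mutex_t lock;
	bool lock_held;		/* crude substitute for lockdep tracking */
	unsigned long gp_seq;	/* state read by the delay computation */
	bool expedited;
};

static void demo_lock(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->lock_held = true;
}

static void demo_unlock(struct demo_state *s)
{
	s->lock_held = false;
	pthread_mutex_unlock(&s->lock);
}

/* Callers must hold s->lock; the assert plays the lockdep role. */
static unsigned long demo_get_delay(struct demo_state *s)
{
	assert(s->lock_held);
	if (s->expedited)
		return 0;
	/* Arbitrary illustrative values, not the kernel's computation. */
	return (s->gp_seq & 0x3) ? 1 : 5;
}

/* Snapshot the delay under the lock, then return it for use unlocked. */
static unsigned long demo_snapshot_delay(struct demo_state *s)
{
	unsigned long delay;

	demo_lock(s);
	delay = demo_get_delay(s);
	demo_unlock(s);
	return delay;
}
```

A caller would initialize the lock with PTHREAD_MUTEX_INITIALIZER and branch on the value returned by demo_snapshot_delay(), much as the patched call sites branch on their local delay or curdelay variables after releasing the lock.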
