Message-ID: <77c8887d-e69c-4554-9c9f-c9d755c7aff5@paulmck-laptop>
Date: Fri, 3 Jan 2025 17:14:46 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: syzbot <syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com>,
acme@...nel.org, adrian.hunter@...el.com,
alexander.shishkin@...ux.intel.com, irogers@...gle.com,
jolsa@...nel.org, kan.liang@...ux.intel.com,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, mark.rutland@....com,
mhiramat@...nel.org, mingo@...hat.com, namhyung@...nel.org,
oleg@...hat.com, peterz@...radead.org,
syzkaller-bugs@...glegroups.com, RCU <rcu@...r.kernel.org>
Subject: Re: [syzbot] [perf?] [trace?] KCSAN: assert: race in
srcu_gp_start_if_needed
On Sun, Nov 24, 2024 at 11:48:46PM +0100, Marco Elver wrote:
> +Cc RCU
>
> On Sun, 24 Nov 2024 at 23:47, syzbot
> <syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 42d9e8b7ccdd Merge tag 'powerpc-6.13-1' of git://git.kerne..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10a00778580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=3d7fd5be0e73b8b
> > dashboard link: https://syzkaller.appspot.com/bug?extid=16a19b06125a2963eaee
> > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/ef231513adc7/disk-42d9e8b7.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/54caaac5960b/vmlinux-42d9e8b7.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/85b5a6566143/bzImage-42d9e8b7.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: assert: race in srcu_get_delay kernel/rcu/srcutree.c:658 [inline]
> > BUG: KCSAN: assert: race in srcu_funnel_gp_start kernel/rcu/srcutree.c:1089 [inline]
> > BUG: KCSAN: assert: race in srcu_gp_start_if_needed+0x808/0x9f0 kernel/rcu/srcutree.c:1339
Hmmm...  All of those are from slow paths, so locking looks to be the
best approach.

A very lightly tested prototype patch is shown below (for which feedback
is most welcome), and thank you all for your testing efforts!

Thanx, Paul
------------------------------------------------------------------------
commit a955c6a7168f7b204784e4ef7e4db9d017043f73
Author: Paul E. McKenney <paulmck@...nel.org>
Date: Fri Jan 3 17:04:49 2025 -0800

srcu: Force synchronization for srcu_get_delay()

Currently, srcu_get_delay() can be called concurrently, for example,
by a CPU that is the first to request a new grace period and the CPU
processing the current grace period. Although concurrent access is
harmless, it unnecessarily expands the state space. Additionally,
all calls to srcu_get_delay() are from slow paths.

This commit therefore protects all calls to srcu_get_delay() with
ssp->srcu_sup->lock, which is already held on the invocation from the
srcu_funnel_gp_start() function. While in the area, this commit also
adds a lockdep_assert_held() to srcu_get_delay() itself.

Reported-by: syzbot+16a19b06125a2963eaee@...kaller.appspotmail.com
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7c7304dee6457..a60acc9cf2f32 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -648,6 +648,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
unsigned long jbase = SRCU_INTERVAL;
struct srcu_usage *sup = ssp->srcu_sup;
+ lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
if (srcu_gp_is_expedited(ssp))
jbase = 0;
if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) {
@@ -675,9 +676,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
void cleanup_srcu_struct(struct srcu_struct *ssp)
{
int cpu;
+ unsigned long delay;
struct srcu_usage *sup = ssp->srcu_sup;
- if (WARN_ON(!srcu_get_delay(ssp)))
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
+ delay = srcu_get_delay(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
+ if (WARN_ON(!delay))
return; /* Just leak it! */
if (WARN_ON(srcu_readers_active(ssp)))
return; /* Just leak it! */
@@ -1100,7 +1105,9 @@ static bool try_check_zero(struct srcu_struct *ssp, int idx, int trycount)
{
unsigned long curdelay;
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
curdelay = !srcu_get_delay(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
for (;;) {
if (srcu_readers_active_idx_check(ssp, idx))
@@ -1849,7 +1856,9 @@ static void process_srcu(struct work_struct *work)
ssp = sup->srcu_ssp;
srcu_advance_state(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
curdelay = srcu_get_delay(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
if (curdelay) {
WRITE_ONCE(sup->reschedule_count, 0);
} else {
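
For readers who want the locking pattern in isolation, here is a minimal
userspace sketch of what the patch does: the helper asserts that the
lock is held instead of taking it, and each caller that did not already
hold the lock takes it briefly, copies the result into a local, and
drops the lock before acting on the value. Everything named demo_* is
hypothetical and invented for this illustration. The pthread mutex
stands in for ssp->srcu_sup->lock, and the lock_held flag is only a
crude substitute for lockdep_assert_held(), not how lockdep works.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock-protected state; not struct srcu_usage. */
struct demo_state {
	pthread_mutex_t lock;
	bool lock_held;		/* crude userspace substitute for lockdep tracking */
	bool expedited;		/* stands in for "grace period is expedited" */
	unsigned long interval;	/* stands in for the base delay */
};

static void demo_lock(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->lock_held = true;
}

static void demo_unlock(struct demo_state *s)
{
	s->lock_held = false;
	pthread_mutex_unlock(&s->lock);
}

/* The helper documents and checks its locking rule rather than locking itself. */
static unsigned long demo_get_delay(struct demo_state *s)
{
	assert(s->lock_held);
	return s->expedited ? 0 : s->interval;
}

/* A caller that previously ran unlocked: lock, sample, unlock, then decide. */
static bool demo_should_reschedule_now(struct demo_state *s)
{
	unsigned long delay;

	demo_lock(s);
	delay = demo_get_delay(s);
	demo_unlock(s);
	return delay == 0;
}
```

Sampling into a local keeps the critical section to the single call, in
the same way that the patch changes cleanup_srcu_struct(),
try_check_zero(), and process_srcu().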