Message-ID: <7adc35fd-2c88-444f-93d4-45fc1a1d7369@nvidia.com>
Date: Fri, 9 May 2025 15:14:06 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: paulmck@...nel.org, Zqiang <qiang.zhang1211@...il.com>
Cc: frederic@...nel.org, neeraj.upadhyay@...nel.org, joel@...lfernandes.org,
urezki@...il.com, boqun.feng@...il.com, rcu@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix rcutorture_one_extend_check() splat in RT kernels
On 5/7/2025 5:04 PM, Paul E. McKenney wrote:
> On Wed, May 07, 2025 at 07:26:03PM +0800, Zqiang wrote:
>> For kernels built with CONFIG_PREEMPT_RT=y, running rcutorture
>> tests resulted in the following splat:
>>
>> [ 68.797425] rcutorture_one_extend_check during change: Current 0x1 To add 0x1 To remove 0x0 preempt_count() 0x0
>> [ 68.797533] WARNING: CPU: 2 PID: 512 at kernel/rcu/rcutorture.c:1993 rcutorture_one_extend_check+0x419/0x560 [rcutorture]
>> [ 68.797601] Call Trace:
>> [ 68.797602] <TASK>
>> [ 68.797619] ? lockdep_softirqs_off+0xa5/0x160
>> [ 68.797631] rcutorture_one_extend+0x18e/0xcc0 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
>> [ 68.797646] ? local_clock+0x19/0x40
>> [ 68.797659] rcu_torture_one_read+0xf0/0x280 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
>> [ 68.797678] ? __pfx_rcu_torture_one_read+0x10/0x10 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
>> [ 68.797804] ? __pfx_rcu_torture_timer+0x10/0x10 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
>> [ 68.797815] rcu-torture: rcu_torture_reader task started
>> [ 68.797824] rcu-torture: Creating rcu_torture_reader task
>> [ 68.797824] rcu_torture_reader+0x238/0x580 [rcutorture 2466dbd2ff34dbaa36049cb323a80c3306ac997c]
>> [ 68.797836] ? kvm_sched_clock_read+0x15/0x30
>>
>> Disabling BH does not change the SOFTIRQ bits in
>> preempt_count() on RT kernels, so this commit uses
>> softirq_count() to check whether BH is disabled.
>>
>> Signed-off-by: Zqiang <qiang.zhang1211@...il.com>
>> ---
>> kernel/rcu/rcutorture.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
>> index 373c65a6e103..ef439569f979 100644
>> --- a/kernel/rcu/rcutorture.c
>> +++ b/kernel/rcu/rcutorture.c
>> @@ -471,7 +471,7 @@ rcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
>> !(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) {
>> started = cur_ops->get_gp_seq();
>> ts = rcu_trace_clock_local();
>> - if (preempt_count() & (SOFTIRQ_MASK | HARDIRQ_MASK))
>> + if ((preempt_count() & HARDIRQ_MASK) || softirq_count())
>> longdelay_ms = 5; /* Avoid triggering BH limits. */
>> mdelay(longdelay_ms);
>> rtrsp->rt_delay_ms = longdelay_ms;
>> @@ -1990,7 +1990,7 @@ static void rcutorture_one_extend_check(char *s, int curstate, int new, int old,
>> return;
>>
>> WARN_ONCE((curstate & (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH)) &&
>> - !(preempt_count() & SOFTIRQ_MASK), ROEC_ARGS);
>> + !softirq_count(), ROEC_ARGS);
>> WARN_ONCE((curstate & (RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED)) &&
>> !(preempt_count() & PREEMPT_MASK), ROEC_ARGS);
>> WARN_ONCE(cur_ops->readlock_nesting &&
>> @@ -2004,7 +2004,7 @@ static void rcutorture_one_extend_check(char *s, int curstate, int new, int old,
>>
>> WARN_ONCE(cur_ops->extendables &&
>> !(curstate & (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH)) &&
>> - (preempt_count() & SOFTIRQ_MASK), ROEC_ARGS);
>> + softirq_count(), ROEC_ARGS);
> Given that softirq_count() is defined as (preempt_count() & SOFTIRQ_MASK)
> for CONFIG_PREEMPT_RT=n, the above hunks don't change anything in that case,
> so good. For CONFIG_PREEMPT_RT=y, softirq_count() looks to be the way
> to check BH-disable nesting, so that is good as well.
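
To make the point above concrete, here is a small userspace sketch of why the old check reads zero on RT. It is only a model under stated assumptions: struct model_task and the model_*() helpers are hypothetical names, the two constants are believed to match the kernel's preempt_count() layout, and the RT branch assumes BH-disable nesting is tracked in a per-task counter rather than in preempt_count().

```c
#include <stdbool.h>

/* Assumed to match the kernel's preempt_count() bit layout. */
#define SOFTIRQ_MASK           0x0000ff00u
#define SOFTIRQ_DISABLE_OFFSET 0x00000200u

/* Hypothetical stand-in for the per-task state that matters here. */
struct model_task {
	bool rt;                          /* models CONFIG_PREEMPT_RT=y */
	unsigned int preempt_count;       /* models preempt_count() */
	unsigned int softirq_disable_cnt; /* models the RT per-task BH counter */
};

/* Disabling BH bumps a different counter depending on the configuration. */
static void model_local_bh_disable(struct model_task *t)
{
	if (t->rt)
		t->softirq_disable_cnt += SOFTIRQ_DISABLE_OFFSET;
	else
		t->preempt_count += SOFTIRQ_DISABLE_OFFSET;
}

/* The check used before the patch: looks only at preempt_count(). */
static unsigned int model_old_check(const struct model_task *t)
{
	return t->preempt_count & SOFTIRQ_MASK;
}

/* The check used after the patch, modeled on softirq_count(). */
static unsigned int model_softirq_count(const struct model_task *t)
{
	unsigned int cnt = t->rt ? t->softirq_disable_cnt : t->preempt_count;

	return cnt & SOFTIRQ_MASK;
}
```

In this model the two checks agree for a non-RT task, while for an RT task only model_softirq_count() sees the BH-disabled state, which lines up with the "preempt_count() 0x0" in the splat.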
>
>> /*
>> * non-preemptible RCU in a preemptible kernel uses preempt_disable()
>> @@ -2025,6 +2025,9 @@ static void rcutorture_one_extend_check(char *s, int curstate, int new, int old,
>> if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
>> mask |= RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
>>
>> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && softirq_count())
>> + mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> At this point in the code, we are complaining if something is disabled
> when it is not supposed to be. So if I understand this correctly, this
> added code would suppress complaints (but only in CONFIG_PREEMPT_RT=y
> kernels) when there is an unexpected rcu_read_lock() in the case where
> there was either local_bh_disable() or rcu_read_lock_bh() in effect.
>
> So I would expect that the CONFIG_PREEMPT_RT=y version of both
> local_bh_disable() and rcu_read_lock_bh() would contain rcu_read_lock().
>
> And in fact, rcu_read_lock_bh() invokes local_bh_disable(),
> which, for CONFIG_PREEMPT_RT=y, invokes __local_bh_disable_ip() in
> kernel/softirq.c, which on the outermost local_bh_disable() really does
> invoke rcu_read_lock().
>
> So this one looks good as well!
>
> Reviewed-by: Paul E. McKenney <paulmck@...nel.org>
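
Paul's reasoning about the extra mask bits can be sketched in userspace as follows. This is a rough model, not the rcutorture code: the MODEL_RDR_* flag values, struct model_rt_task, and both helpers are hypothetical, and the shape of the final check (warn when RCU nesting is positive but no expected bit is set in curstate) is an assumption based on the discussion above.

```c
#include <stdbool.h>

/* Hypothetical reader-state flags; the values are illustrative only. */
#define MODEL_RDR_BH    0x01u
#define MODEL_RDR_RBH   0x08u
#define MODEL_RDR_RCU_1 0x20u
#define MODEL_RDR_RCU_2 0x40u

/* Hypothetical per-task state for an RT kernel. */
struct model_rt_task {
	unsigned int bh_disable_depth; /* models BH-disable nesting */
	int rcu_nesting;               /* models RCU read-side nesting */
};

/* RT flavor: only the outermost BH disable enters an RCU read-side section. */
static void model_rt_local_bh_disable(struct model_rt_task *t)
{
	if (t->bh_disable_depth++ == 0)
		t->rcu_nesting++;
}

/* True when the modeled "unexpected rcu_read_lock()" check would warn. */
static bool model_would_warn(const struct model_rt_task *t,
			     unsigned int curstate, bool patched)
{
	unsigned int mask = MODEL_RDR_RCU_1 | MODEL_RDR_RCU_2;

	/* The patch also accepts the BH states while BH is disabled on RT. */
	if (patched && t->bh_disable_depth)
		mask |= MODEL_RDR_BH | MODEL_RDR_RBH;

	return !(curstate & mask) && t->rcu_nesting > 0;
}
```

With curstate set to MODEL_RDR_BH after one model_rt_local_bh_disable() call, the unpatched model warns because the implicit read-side section is not accounted for, and the patched model does not.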
This is a fix, so I am applying it with the review tag for 6.16. Thanks!
- Joel