[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <70e3482b-3181-446c-b3bd-db590008123e@nvidia.com>
Date: Wed, 7 May 2025 12:52:04 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Zqiang <qiang.zhang1211@...il.com>, paulmck@...nel.org,
neeraj.upadhyay@...nel.org, joel@...lfernandes.org, urezki@...il.com,
boqun.feng@...il.com, rcu@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] rcu/nocb: Add Safe checks for access offloaded rdp
On 5/7/2025 12:31 PM, Frederic Weisbecker wrote:
> Le Wed, May 07, 2025 at 12:06:29PM -0400, Joel Fernandes a écrit :
>>
>>
>> On 5/7/2025 7:26 AM, Zqiang wrote:
>>> For built with CONFIG_PROVE_RCU=y and CONFIG_PREEMPT_RT=y kernels,
>>> Disable BH does not change the SOFTIRQ corresponding bits in
>>> preempt_count(), but change current->softirq_disable_cnt, this
>>> resulted in the following splat:
>>>
>>> WARNING: suspicious RCU usage
>>> kernel/rcu/tree_plugin.h:36 Unsafe read of RCU_NOCB offloaded state!
>>> stack backtrace:
>>> CPU: 0 UID: 0 PID: 22 Comm: rcuc/0
>>> Call Trace:
>>> [ 0.407907] <TASK>
>>> [ 0.407910] dump_stack_lvl+0xbb/0xd0
>>> [ 0.407917] dump_stack+0x14/0x20
>>> [ 0.407920] lockdep_rcu_suspicious+0x133/0x210
>>> [ 0.407932] rcu_rdp_is_offloaded+0x1c3/0x270
>>> [ 0.407939] rcu_core+0x471/0x900
>>> [ 0.407942] ? lockdep_hardirqs_on+0xd5/0x160
>>> [ 0.407954] rcu_cpu_kthread+0x25f/0x870
>>> [ 0.407959] ? __pfx_rcu_cpu_kthread+0x10/0x10
>>> [ 0.407966] smpboot_thread_fn+0x34c/0xa50
>>> [ 0.407970] ? trace_preempt_on+0x54/0x120
>>> [ 0.407977] ? __pfx_smpboot_thread_fn+0x10/0x10
>>> [ 0.407982] kthread+0x40e/0x840
>>> [ 0.407990] ? __pfx_kthread+0x10/0x10
>>> [ 0.407994] ? rt_spin_unlock+0x4e/0xb0
>>> [ 0.407997] ? rt_spin_unlock+0x4e/0xb0
>>> [ 0.408000] ? __pfx_kthread+0x10/0x10
>>> [ 0.408006] ? __pfx_kthread+0x10/0x10
>>> [ 0.408011] ret_from_fork+0x40/0x70
>>> [ 0.408013] ? __pfx_kthread+0x10/0x10
>>> [ 0.408018] ret_from_fork_asm+0x1a/0x30
>>> [ 0.408042] </TASK>
>>>
>>> Currently, triggering an rdp offloaded state change need the
>>> corresponding rdp's CPU goes offline, and at this time the rcuc
>>> kthreads has already in parking state. this means the corresponding
>>> rcuc kthreads can safely read offloaded state of rdp while it's
>>> corresponding cpu is online.
>>>
>>> This commit therefore add softirq_count() check for
>>> Preempt-RT kernels.
>>>
>>> Suggested-by: Joel Fernandes <joelagnelf@...dia.com>
>>> Signed-off-by: Zqiang <qiang.zhang1211@...il.com>
>>> ---
>>> kernel/rcu/tree_plugin.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>>> index 003e549f6514..a91b2322a0cd 100644
>>> --- a/kernel/rcu/tree_plugin.h
>>> +++ b/kernel/rcu/tree_plugin.h
>>> @@ -29,7 +29,7 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
>>> (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
>>> lockdep_is_held(&rdp->nocb_lock) ||
>>> lockdep_is_held(&rcu_state.nocb_mutex) ||
>>> - (!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) &&
>>> + ((!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) || softirq_count()) &&
>>> rdp == this_cpu_ptr(&rcu_data)) ||
>> This looks good to me. Frederic told me he'll further review and give final
>> green signal. Then I'll pull this particular one.
>>
>> One thing I was wondering -- it would be really nice if preemptible() itself
>> checked for softirq_count() by default. Or adding something like a
>> really_preemptible() which checks for both CONFIG_PREEMPT_COUNT and
>> softirq_count() along with preemptible(). I feel like this always comes back to
>> bite us in different ways, and not knowing atomicity complicates various code paths.
>>
>> Maybe a summer holidays project? ;)
>
> I thought about that too but I think this is semantically incorrect.
> In PREEMPT_RT, softirqs are actually preemptible.
You are right, for this usecase it may not be that helpful. However, I do find
it odd that the caller has to check CONFIG_PREEMPT_COUNT before calling
preemptible() above, that should be implemented in the API itself.
I was more broadly referring to the recurring problem of "am I in atomic"
context or "can I be preempted" or "can I sleep", if so do this otherwise do
something else. For instance calling GFP_KERNEL vs GFP_ATOMIC in the memory
allocator. Though sleeping and preemption are perhaps separate concerns. We ran
into this during the kfree_rcu() early days as well.
IIRC we have perhaps too many APIs that do similar things like in_atomic() which
confuses everyone (I don't fully know the current state of all such APIs). Add
PREEMPT_RT to the mix, and now we have more complications. :-/
thanks,
- Joel
Powered by blists - more mailing lists