Message-ID: <751e87e0-4113-4887-8436-c4a69c82abcb@paulmck-laptop>
Date: Fri, 24 Jan 2025 13:37:42 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Waiman Long <llong@...hat.com>
Cc: John Stultz <jstultz@...gle.com>, Thomas Gleixner <tglx@...utronix.de>,
Stephen Boyd <sboyd@...nel.org>, Feng Tang <feng.tang@...el.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH 2/2] clocksource: Defer preempt_disable() after
clocksource_verify_choose_cpus()
On Fri, Jan 24, 2025 at 03:41:45PM -0500, Waiman Long wrote:
> On 1/24/25 3:37 PM, Paul E. McKenney wrote:
> > On Fri, Jan 24, 2025 at 12:25:19PM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 24, 2025 at 01:59:23PM -0500, Waiman Long wrote:
> > > > The following bug report happened in a PREEMPT_RT kernel.
> > > >
> > > > [ 30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > > > [ 30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
> > > > [ 30.962673] preempt_count: 1, expected: 0
> > > > [ 30.962676] RCU nest depth: 0, expected: 0
> > > > [ 30.962680] 3 locks held by kwatchdog/2012:
> > > > [ 30.962684] #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
> > > > [ 30.967703] #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
> > > > [ 30.972774] #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
> > > > [ 30.977827] Preemption disabled at:
> > > > [ 30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
> > > > [ 30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
> > > > [ 30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
> > > > [ 30.982846] Call Trace:
> > > > [ 30.982850] <TASK>
> > > > [ 30.983821] dump_stack_lvl+0x57/0x81
> > > > [ 30.983821] __might_resched.cold+0xf4/0x12f
> > > > [ 30.983824] rt_spin_lock+0x4c/0x100
> > > > [ 30.988833] get_random_u32+0x4f/0x110
> > > > [ 30.988833] clocksource_verify_choose_cpus+0xab/0x1a0
> > > > [ 30.988833] clocksource_verify_percpu.part.0+0x6b/0x330
> > > > [ 30.993894] __clocksource_watchdog_kthread+0x193/0x1a0
> > > > [ 30.993898] clocksource_watchdog_kthread+0x18/0x50
> > > > [ 30.993898] kthread+0x114/0x140
> > > > [ 30.993898] ret_from_fork+0x2c/0x50
> > > > [ 31.002864] </TASK>
> > > >
> > > > It happens because get_random_u32() is called in
> > > > clocksource_verify_choose_cpus() with preemption disabled. If
> > > > crng_ready() is true by the time get_random_u32() is called, the
> > > > batched_entropy_32 local lock will be acquired. In a PREEMPT_RT
> > > > kernel, that lock is an rtmutex, which cannot be acquired with
> > > > preemption disabled.
> > > >
> > > > To avoid this problem, get_random_u32() must not be called with
> > > > preemption disabled. However, smp_processor_id() does have to be
> > > > called with preemption disabled, and the current CPU has to be
> > > > excluded from the cpus_chosen list to be tested.
> > > >
> > > > Extract current CPU removal code out from
> > > > clocksource_verify_choose_cpus() and defer the preempt_disable()
> > > > call to after clocksource_verify_choose_cpus() and before
> > > > current CPU removal. Also use raw_smp_processor_id() in
> > > > clocksource_verify_choose_cpus().
> > > >
> > > > Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
> > > > Signed-off-by: Waiman Long <longman@...hat.com>
> > > Good catch!
> > >
> > > But we don't need cryptographically secure random numbers (or blistering
> > > speed) here. Would substituting something like torture_random() % nr_cpu_ids
> > > for get_random_u32_below() work?
> > I suppose I should add my concern... If we don't have preemption disabled
> > across this code, we cannot reliably avoid attempting to IPI ourselves.
>
> Does the CPU-choosing process itself need to have preemption disabled? I
> thought it was only because of the need to use smp_processor_id() and to
> have the current CPU excluded. Preemption is still disabled after that.
If the function is migrated from one CPU to another, the check against
raw_smp_processor_id() doesn't mean much. At that point, you might
as well just rely on the calling function clearing it.
And the extra preemption disabling happens only in an error condition
where extra debugging is enabled. Plus clocksource_verify_choose_cpus()
should be pretty quick. So do we really care about the additional
disabling of preemption?
Thanx, Paul