Message-ID: <ZiKQg6fxn6OxAmr5@slm.duckdns.org>
Date: Fri, 19 Apr 2024 05:40:51 -1000
From: Tejun Heo <tj@...nel.org>
To: Sven Schnelle <svens@...ux.ibm.com>
Cc: Lai Jiangshan <jiangshanlai@...il.com>, linux-kernel@...r.kernel.org,
Heiko Carstens <hca@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] workqueue: fix selection of wake_cpu in kick_pool()
Hello, Sven.
On Fri, Apr 19, 2024 at 10:27:05AM +0200, Sven Schnelle wrote:
> > Probably by wrapping determining the wake_cpu and the wake_up inside
> > cpu_read_lock() section.
>
> Do you mean rcu_read_lock()? cpus_read_lock() takes a mutex, and the
> crash happens in softirq context - so cpus_read_lock() can't be the
> correct lock.
I meant cpus_read_lock() but yeah we can't use that here.
> If I read the code correctly, cpu hotplug uses stop_machine_cpuslocked()
> - so rcu_read_lock() should be sufficient for non-atomic context.
>
> Looking at the backtrace the crash is actually happening in
> arch_vcpu_is_preempted(). I don't know the semantics of that function,
> whether it is ok to call it for offline CPUs, or whether the calling
> code should make sure that the cpu is online (which would be my guess).
>
> Following the backtrace from my initial mail, I can't find a place where
> a check is done whether p->wake_cpu is actually online. Eventually
> available_idle_cpu() is calling vcpu_is_preempted(). I wonder whether
> available_idle_cpu() should do a cpu_online() check right at the
> beginning?
Yeah, adding a cpu_online() test there makes more sense to me.
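To make the idea concrete, here is a rough userspace sketch, not a tested
kernel patch. The arrays, NR_CPUS value and preempt_queries counter are
made-up stand-ins for kernel state, and the stubbed cpu_online(), idle_cpu()
and vcpu_is_preempted() only mimic the names of the real helpers. The point
is just the ordering: test online first so the preemption query is never
made for an offline CPU.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical per-CPU state; CPU 2 is offline. */
static bool online_mask[NR_CPUS]    = { true,  true,  false, true };
static bool idle_mask[NR_CPUS]      = { true,  false, true,  true };
static bool preempted_mask[NR_CPUS] = { false, false, false, true };

/* Counts calls into the stubbed preemption query (model-only helper). */
static int preempt_queries;

/* Userspace stubs named after the kernel helpers they stand in for. */
static bool cpu_online(int cpu) { return online_mask[cpu]; }
static bool idle_cpu(int cpu)   { return idle_mask[cpu]; }

static bool vcpu_is_preempted(int cpu)
{
	preempt_queries++;
	return preempted_mask[cpu];
}

/* Sketch of available_idle_cpu() with the online test done up front. */
static bool available_idle_cpu(int cpu)
{
	if (!cpu_online(cpu))
		return false;	/* never ask about an offline CPU */

	if (!idle_cpu(cpu))
		return false;

	if (vcpu_is_preempted(cpu))
		return false;

	return true;
}
```

In this model CPU 2 is rejected before vcpu_is_preempted() is reached, which
is the behaviour we'd want for a stale p->wake_cpu. Whether the check is
cheap enough for that path is a separate question.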
> Adding Peter to CC, he probably knows.
Peter?
Thanks.
--
tejun