Message-ID: <d98bb24d-b8cd-b00b-57c3-d96dae57ad5b@redhat.com>
Date: Tue, 19 Apr 2022 14:34:37 +0200
From: David Hildenbrand <david@...hat.com>
To: Joel Savitz <jsavitz@...hat.com>, linux-kernel@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <valentin.schneider@....com>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <frederic@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Yuan ZhaoXiong <yuanzhaoxiong@...du.com>,
Baokun Li <libaokun1@...wei.com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
YueHaibing <yuehaibing@...wei.com>,
Randy Dunlap <rdunlap@...radead.org>,
David Hildenbrand <dhildenb@...hat.com>
Subject: Re: [RFC PATCH] kernel/cpu: restart cpu_up when hotplug is disabled
On 18.04.22 21:54, Joel Savitz wrote:
> The cpu hotplug path may be utilized while hotplug is disabled for a
> brief moment leading to failures. As an example, attempts to perform
> cpu hotplug by userspace soon after boot may race with pci_device_probe
> leading to inconsistent results.
You might want to elaborate a bit on the situation in which we observed
that issue fairly reliably.

When restricting the number of boot cpus on the kernel cmdline, e.g.,
via "maxcpus=2", udev will find the offline cpus when enumerating all
cpus and try onlining them. Due to the race, onlining of some cpus
fails, e.g., when racing with pci_device_probe().

While teaching udev to not online coldplugged CPUs when "maxcpus" was
specified ("policy"), we noticed the underlying issue: onlining a CPU
can fail with -EBUSY in corner cases when cpu hotplug is temporarily
disabled.
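
To illustrate the failure mode from the user-space side, here is a
minimal sketch of onlining a CPU through sysfs with a bounded retry on
-EBUSY. It is not what udev does; the helper names
(write_online, online_cpu_with_retry), the retry count and the 5 ms
delay are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Write "1" to a sysfs-style "online" file, e.g.
 * /sys/devices/system/cpu/cpu2/online.
 * Returns 0 on success or a negative errno value.
 */
static int write_online(const char *path)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	ssize_t n = write(fd, "1", 1);
	/* Capture the result before close() can overwrite errno. */
	int err = (n == 1) ? 0 : (n < 0 ? -errno : -EIO);

	close(fd);
	return err;
}

/*
 * Hypothetical helper: retry while the kernel reports -EBUSY, which is
 * what onlining returns while cpu hotplug is temporarily disabled.
 * Any other result (success or a different error) ends the loop.
 */
static int online_cpu_with_retry(const char *path, int max_tries)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 5 * 1000 * 1000 };
	int err = -EINVAL;	/* returned if max_tries <= 0 */

	for (int i = 0; i < max_tries; i++) {
		err = write_online(path);
		if (err != -EBUSY)
			break;
		nanosleep(&delay, NULL);	/* back off before retrying */
	}
	return err;
}
```

Every tool that onlines CPUs would need a loop like this, and would
have to pick its own retry count and delay.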
>
> Proposed idea:
> Call restart_syscall instead of returning -EBUSY since
> cpu_hotplug_disabled seems to only have a positive value
> for short, temporary amounts of time.
>
> Does anyone see any serious problems with this?
>
> Signed-off-by: Joel Savitz <jsavitz@...hat.com>
> ---
> kernel/cpu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 5797c2a7a93f..2992c7d1d24e 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -35,6 +35,7 @@
> #include <linux/percpu-rwsem.h>
> #include <linux/cpuset.h>
> #include <linux/random.h>
> +#include <linux/delay.h>
>
> #include <trace/events/power.h>
> #define CREATE_TRACE_POINTS
> @@ -1401,7 +1402,9 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
>  	cpu_maps_update_begin();
>
>  	if (cpu_hotplug_disabled) {
> -		err = -EBUSY;
> +		/* avoid busy looping (5ms of sleep should be enough) */
> +		msleep(5);
> +		err = restart_syscall();
It's worth noting that we use the same approach in
lock_device_hotplug_sysfs(). It's far from perfect, I would say, but we
really wanted to avoid making user space deal with retry logic.
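
The sleep-then-restart pattern can be sketched as a small user-space
model. This is not kernel code: in the kernel, restart_syscall() causes
the system call to be re-entered on the way back to user space, whereas
the model replaces that machinery with a plain loop. All model_* names
and MODEL_RESTART are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

#define MODEL_DONE	0
#define MODEL_RESTART	1	/* stands in for the kernel's restart request */

/* Model state: > 0 while hotplug is temporarily disabled. */
static int model_hotplug_disabled;

/* One attempt: either proceed, or back off briefly and ask to be restarted. */
static int model_cpu_up(void)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 5 * 1000 * 1000 };

	if (model_hotplug_disabled > 0) {
		nanosleep(&delay, NULL);	/* avoid busy looping */
		return MODEL_RESTART;
	}
	return MODEL_DONE;
}

/* Stands in for whatever code re-enables hotplug in the meantime. */
static void model_enable_hotplug(void)
{
	if (model_hotplug_disabled > 0)
		model_hotplug_disabled--;
}

/*
 * Models the restart path: the operation is re-entered until it stops
 * asking for a restart. The caller never sees a "busy" result, only the
 * final outcome. Returns the number of restarts that were needed.
 * Note: if nothing ever re-enables hotplug, this loops forever.
 */
static int model_write_online(void (*between_attempts)(void))
{
	int restarts = 0;

	while (model_cpu_up() == MODEL_RESTART) {
		restarts++;
		if (between_attempts != NULL)
			between_attempts();
	}
	return restarts;
}
```

The model also shows the caveat of the approach: it only terminates
because the disabled state is temporary.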
For example, while memory onlining can in theory fail with -EBUSY, it
is not expected to fail in practice (we only fail in very rare cases,
when a memory notifier fails -- for example when kasan fails to
allocate memory).
--
Thanks,
David / dhildenb