Message-ID: <xhsmh35gzj77s.mognet@vschneid.remote.csb>
Date: Tue, 24 May 2022 16:11:51 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Phil Auld <pauld@...hat.com>, linux-kernel@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] cpuhp: make target_store() a nop when target == state
On 23/05/22 10:47, Phil Auld wrote:
> writing the current state back into hotplug/target calls cpu_down()
> which will set cpu dying even when it isn't and then nothing will
> ever clear it. A stress test that reads values and writes them back
> for all cpu device files in sysfs will trigger the BUG() in
> select_fallback_rq once all cpus are marked as dying.
>
> kernel/cpu.c::target_store()
> 	...
> 	if (st->state < target)
> 		ret = cpu_up(dev->id, target);
> 	else
> 		ret = cpu_down(dev->id, target);
>
> cpu_down() -> cpu_set_state()
> 	bool bringup = st->state < target;
> 	...
> 	if (cpu_dying(cpu) != !bringup)
> 		set_cpu_dying(cpu, !bringup);
>
> Make this safe by catching the case where target == state
> and bailing early.
>
> Signed-off-by: Phil Auld <pauld@...hat.com>
> ---
>
> Yeah, I know... don't do that. But it's still messy.
>
> !< != >
>
> kernel/cpu.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index d0a9aa0b42e8..8a71b1149c60 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2302,6 +2302,9 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,
>  		return -EINVAL;
>  #endif
>  
> +	if (target == st->state)
> +		return count;
> +

The current checks are against static boundaries, whereas this one has to
compare against st->state. AFAICT this could race with another hotplug
operation on the same CPU, e.g.:

  CPU42.cpuhp_state
    ->state  == CPUHP_AP_SCHED_STARTING
    ->target == CPUHP_ONLINE

  <write CPUHP_ONLINE via sysfs, OK because current state != CPUHP_ONLINE>

  CPU42.cpuhp_state == CPUHP_ONLINE

  <issues ensue>

_cpu_up() has:

	/*
	 * The caller of cpu_up() might have raced with another
	 * caller. Nothing to do.
	 */
	if (st->state >= target)
		goto out;

Looks like we want an equivalent in _cpu_down(). What do you think?
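
For illustration only, here is a minimal user-space sketch of how such a
guard could behave. It is not kernel code: the model_* names, the
three-value state enum and the struct layout are all hypothetical
stand-ins, and the "<=" comparison is simply assumed to be the mirror image
of the ">=" check in _cpu_up().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for hotplug states (not the kernel enum). */
enum model_state {
	MODEL_OFFLINE,
	MODEL_AP_SCHED_STARTING,
	MODEL_ONLINE,
};

/* Hypothetical per-CPU bookkeeping, loosely modelled on the fields quoted above. */
struct model_cpu {
	enum model_state state;
	enum model_state target;
	bool dying;
};

/* Mirrors the quoted cpu_set_state() logic: direction comes from state < target. */
static void model_set_state(struct model_cpu *st, enum model_state target)
{
	bool bringup;

	assert(st != NULL);
	bringup = st->state < target;
	st->target = target;
	if (st->dying != !bringup)
		st->dying = !bringup;
}

/* Unguarded down path: marks the CPU dying even when state == target. */
static int model_cpu_down(struct model_cpu *st, enum model_state target)
{
	assert(st != NULL);
	model_set_state(st, target);
	/* Walk the state down one step at a time; no steps if already there. */
	while (st->state > target)
		st->state = (enum model_state)(st->state - 1);
	return 0;
}

/* Guarded down path: assumed mirror of the _cpu_up() race check. */
static int model_cpu_down_guarded(struct model_cpu *st, enum model_state target)
{
	assert(st != NULL);
	/* Someone else already got us to (or below) the target. Nothing to do. */
	if (st->state <= target)
		return 0;
	return model_cpu_down(st, target);
}
```

In this model, calling model_cpu_down() on a CPU whose state already equals
the target leaves dying set with nothing to clear it, while
model_cpu_down_guarded() returns early and leaves the flag alone. A real
fix would also need to respect the hotplug locking, which the sketch
ignores.
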

>  	ret = lock_device_hotplug_sysfs();
>  	if (ret)
>  		return ret;
> --
> 2.18.0