[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3256f9c1-dccd-5995-5b14-afaae281be90@huawei.com>
Date: Thu, 24 Aug 2023 15:25:22 +0800
From: Yu Liao <liaoyu15@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Vincent Guittot <vincent.guittot@...aro.org>
CC: Xiongfeng Wang <wangxiongfeng2@...wei.com>, <vschneid@...hat.com>,
Phil Auld <pauld@...hat.com>, <vdonnefort@...gle.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Wei Li <liwei391@...wei.com>, <zhangqiao22@...wei.com>,
Peter Zijlstra <peterz@...radead.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [Question] report a race condition between CPU hotplug state
machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling
On 2023/8/23 18:14, Thomas Gleixner wrote:
> Subject: cpu/hotplug: Prevent self deadlock on CPU hot-unplug
> From: Thomas Gleixner <tglx@...utronix.de>
> Date: Wed, 23 Aug 2023 10:47:02 +0200
>
> Xiongfeng reported and debugged a self deadlock of the task which initiates
> and controls a CPU hot-unplug operation vs. the CFS bandwidth timer.
>
> CPU1 CPU2
>
> T1 sets cfs_quota
> starts hrtimer cfs_bandwidth 'period_timer'
> T1 is migrated to CPU2
> T1 initiates offlining of CPU1
> Hotplug operation starts
> ...
> 'period_timer' expires and is re-enqueued on CPU1
> ...
> take_cpu_down()
> CPU1 shuts down and does not handle timers
> anymore. They have to be migrated in the
> post dead hotplug steps by the control task.
>
> T1 runs the post dead offline operation
> T1 is scheduled out
> T1 waits for 'period_timer' to expire
>
> T1 waits there forever if it is scheduled out before it can execute the hrtimer
> offline callback hrtimers_dead_cpu().
>
> Cure this by delegating the hotplug control operation to a worker thread on
> an online CPU. This takes the initiating user space task, which might be
> affected by the bandwidth timer, completely out of the picture.
>
> Reported-by: Xiongfeng Wang <wangxiongfeng2@...wei.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> Link: https://lore.kernel.org/lkml/8e785777-03aa-99e1-d20e-e956f5685be6@huawei.com
> ---
> kernel/cpu.c | 24 +++++++++++++++++++++++-
> 1 file changed, 23 insertions(+), 1 deletion(-)
>
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1467,8 +1467,22 @@ static int __ref _cpu_down(unsigned int
> return ret;
> }
>
> +struct cpu_down_work {
> + unsigned int cpu;
> + enum cpuhp_state target;
> +};
> +
> +static long __cpu_down_maps_locked(void *arg)
> +{
> + struct cpu_down_work *work = arg;
> +
> + return _cpu_down(work->cpu, 0, work->target);
> +}
> +
> static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
> {
> + struct cpu_down_work work = { .cpu = cpu, .target = target, };
> +
> /*
> * If the platform does not support hotplug, report it explicitly to
> * differentiate it from a transient offlining failure.
> @@ -1477,7 +1491,15 @@ static int cpu_down_maps_locked(unsigned
> return -EOPNOTSUPP;
> if (cpu_hotplug_disabled)
> return -EBUSY;
> - return _cpu_down(cpu, 0, target);
> +
> + /*
> + * Ensure that the control task does not run on the to be offlined
> + * CPU to prevent a deadlock against cfs_b->period_timer.
> + */
> + cpu = cpumask_any_but(cpu_online_mask, cpu);
> + if (cpu >= nr_cpu_ids)
> + return -EBUSY;
> + return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
> }
>
> static int cpu_down(unsigned int cpu, enum cpuhp_state target)
Thanks for the patch. Tested in v6.5-rc5 with test script provided by
Xiongfeng, this patch works.
Tested-by: Yu Liao <liaoyu15@...wei.com>
Best regards,
Yu
Powered by blists - more mailing lists