[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e5171d0d-5a4b-75b6-53b4-d1251cacc0ff@codeaurora.org>
Date: Fri, 8 Sep 2017 07:43:21 +0530
From: Prateek Sood <prsood@...eaurora.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: tj@...nel.org, lizefan@...wei.com, cgroups@...r.kernel.org,
mingo@...nel.org, longman@...hat.com, boqun.feng@...il.com,
tglx@...utronix.de, linux-kernel@...r.kernel.org,
sramana@...eaurora.org
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock
On 09/07/2017 11:15 PM, Peter Zijlstra wrote:
> On Thu, Sep 07, 2017 at 07:26:23PM +0530, Prateek Sood wrote:
>> Remove circular dependency deadlock in a scenario where hotplug of CPU is
>> being done while there is updation in cgroup and cpuset triggered from
>> userspace.
>>
>> Process A => kthreadd => Process B => Process C => Process A
>
>> Process A
>> cpu_subsys_offline();
>> cpu_down();
>> _cpu_down();
>> percpu_down_write(&cpu_hotplug_lock); //held
>> cpuhp_invoke_callback();
>> workqueue_offline_cpu();
>> wq_update_unbound_numa();
>> kthread_create_on_node();
>> wake_up_process(); //wakeup kthreadd
>
> TJ, I'm puzzled, why would we need to spawn new threads to update NUMA
> affinity when taking a CPU out? That doesn't make sense to me, we can
> either shrink the affinity of an existing thread or completely kill of a
> thread if the mask becomes empty. But why spawn a new thread?
>
>> flush_work();
>> wait_for_completion();
>>
>
> Yes, inverting cpuset and hotplug would break that chain, but I'm still
> wondering why workqueue needs to spawn threads on CPU down.
>
Thanks for the comments Peter
You rightly mentioned that a new thread will not be spawn
while updating NUMA affinity when taking a CPU out.
While a CPU is made offline, attempt is made to unbind per-cpu
worker for CPU going down. This is done by queuing unbind work
to system_highpri_wq. It results in an attempt to create one
bounded worker thread as there is none.
wait_for_completion() in flush_work() waits for unbinding to
finish for CPU going down.
Process A
cpu_subsys_offline();
cpu_down();
_cpu_down();
percpu_down_write(&cpu_hotplug_lock); //held
cpuhp_invoke_callback();
workqueue_offline_cpu();
queue_work_on(system_highpri_wq);
__queue_work();
insert_work();
wake_up_worker(); //pool->nr_running = 0
flush_work();
wait_for_completion();
worker_thread();
need_more_worker(); // returns true
manage_workers();
maybe_create_worker();
create_worker();
kthread_create_on_node();
wake_up_process(kthreadd_task);
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project
Powered by blists - more mailing lists