lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 8 Sep 2017 07:43:21 +0530
From:   Prateek Sood <prsood@...eaurora.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     tj@...nel.org, lizefan@...wei.com, cgroups@...r.kernel.org,
        mingo@...nel.org, longman@...hat.com, boqun.feng@...il.com,
        tglx@...utronix.de, linux-kernel@...r.kernel.org,
        sramana@...eaurora.org
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock

On 09/07/2017 11:15 PM, Peter Zijlstra wrote:
> On Thu, Sep 07, 2017 at 07:26:23PM +0530, Prateek Sood wrote:
>> Remove circular dependency deadlock in a scenario where hotplug of CPU is
>> being done while there is updation in cgroup and cpuset triggered from
>> userspace.
>>
>> Process A => kthreadd => Process B => Process C => Process A
> 
>> Process A
>> cpu_subsys_offline();
>>   cpu_down();
>>     _cpu_down();
>>       percpu_down_write(&cpu_hotplug_lock); //held
>>       cpuhp_invoke_callback();
>>         workqueue_offline_cpu();
>>           wq_update_unbound_numa();
>>             kthread_create_on_node();
>>               wake_up_process();  //wakeup kthreadd
> 
> TJ, I'm puzzled, why would we need to spawn new threads to update NUMA
> affinity when taking a CPU out? That doesn't make sense to me, we can
> either shrink the affinity of an existing thread or completely kill of a
> thread if the mask becomes empty. But why spawn a new thread?
> 
>>           flush_work();
>>           wait_for_completion();
>>

> 
> Yes, inverting cpuset and hotplug would break that chain, but I'm still
> wondering why workqueue needs to spawn threads on CPU down.
> 

Thanks for the comments Peter

You rightly mentioned that a new thread will not be spawn
while updating NUMA affinity when taking a CPU out.

While a CPU is made offline, attempt is made to unbind per-cpu
worker for CPU going down. This is done by queuing unbind work
to system_highpri_wq. It results in an attempt to create one
bounded worker thread as there is none. 

wait_for_completion() in flush_work() waits for unbinding to
finish for CPU going down.

Process A
cpu_subsys_offline();
   cpu_down();
     _cpu_down();
       percpu_down_write(&cpu_hotplug_lock); //held
       cpuhp_invoke_callback();
         workqueue_offline_cpu();
           queue_work_on(system_highpri_wq);
             __queue_work();
               insert_work();
                 wake_up_worker(); //pool->nr_running = 0
           flush_work();
		   wait_for_completion();
		   
	   
worker_thread();
  need_more_worker(); // returns true
  manage_workers();
    maybe_create_worker();
	  create_worker();
	    kthread_create_on_node();
		  wake_up_process(kthreadd_task);



-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ