lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 26 Oct 2017 10:05:12 -0400
From:   Waiman Long <longman@...hat.com>
To:     Prateek Sood <prsood@...eaurora.org>, peterz@...radead.org,
        tj@...nel.org, lizefan@...wei.com, mingo@...nel.org,
        boqun.feng@...il.com, tglx@...utronix.de
Cc:     cgroups@...r.kernel.org, sramana@...eaurora.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock

On 10/26/2017 07:52 AM, Prateek Sood wrote:
> Remove circular dependency deadlock in a scenario where hotplug of CPU is
> being done while there is updation in cgroup and cpuset triggered from
> userspace.
>
> Process A => kthreadd => Process B => Process C => Process A
>
> Process A
> cpu_subsys_offline();
>   cpu_down();
>     _cpu_down();
>       percpu_down_write(&cpu_hotplug_lock); //held
>       cpuhp_invoke_callback();
>         workqueue_offline_cpu();
>           wq_update_unbound_numa();
>             kthread_create_on_node();
>               wake_up_process();  //wakeup kthreadd
>           flush_work();
>           wait_for_completion();
>
> kthreadd
> kthreadd();
>   kernel_thread();
>     do_fork();
>       copy_process();
>         percpu_down_read(&cgroup_threadgroup_rwsem);
>           __rwsem_down_read_failed_common(); //waiting
>
> Process B
> kernfs_fop_write();
>   cgroup_file_write();
>     cgroup_procs_write();
>       percpu_down_write(&cgroup_threadgroup_rwsem); //held
>       cgroup_attach_task();
>         cgroup_migrate();
>           cgroup_migrate_execute();
>             cpuset_can_attach();
>               mutex_lock(&cpuset_mutex); //waiting
>
> Process C
> kernfs_fop_write();
>   cgroup_file_write();
>     cpuset_write_resmask();
>       mutex_lock(&cpuset_mutex); //held
>       update_cpumask();
>         update_cpumasks_hier();
>           rebuild_sched_domains_locked();
>             get_online_cpus();
>               percpu_down_read(&cpu_hotplug_lock); //waiting
>
> Eliminating deadlock by reversing the locking order for cpuset_mutex and
> cpu_hotplug_lock.

General comments:

Please add a version number of your patch. I have seen multiple versions
of this patch and have lost track how many are there as there is no
version number information.  In addition, there are changes beyond just
swapping the lock order and they are not documented in this change log.
I would like to see you discuss about those additional changes here as well.

>  void rebuild_sched_domains(void)
>  {
> +	cpus_read_lock();
>  	mutex_lock(&cpuset_mutex);
> -	rebuild_sched_domains_locked();
> +	rebuild_sched_domains_cpuslocked();
>  	mutex_unlock(&cpuset_mutex);
> +	cpus_read_unlock();
>  }

I saw a lot of instances where cpus_read_lock() and mutex_lock() come
together. Maybe some new lock/unlock helper functions may help.

> @@ -2356,25 +2354,29 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
>  	}
>  
>  	/* rebuild sched domains if cpus_allowed has changed */
> -	if (cpus_updated || force_rebuild) {
> -		force_rebuild = false;
> -		rebuild_sched_domains();
> +	if (cpus_updated) {
> +		if (use_cpu_hp_lock)
> +			rebuild_sched_domains();
> +		else {
> +			/* When called during cpu hotplug cpu_hotplug_lock
> +			 * is held by the calling thread, not
> +			 * not cpuhp_thread_fun
> +			 */

??? The comment is not clear.

Cheers,
Longman

Powered by blists - more mailing lists