Message-ID: <20090609234757.GH16117@linux.vnet.ibm.com>
Date:	Tue, 9 Jun 2009 16:47:58 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>, ego@...ibm.com,
	rusty@...tcorp.com.au, mingo@...e.hu, linux-kernel@...r.kernel.org,
	peterz@...radead.org, oleg@...hat.com, dipankar@...ibm.com
Subject: Re: [PATCH -mm] cpuhotplug: introduce try_get_online_cpus() take 3

On Tue, Jun 09, 2009 at 12:34:38PM -0700, Andrew Morton wrote:
> On Tue, 09 Jun 2009 20:07:09 +0800
> Lai Jiangshan <laijs@...fujitsu.com> wrote:
> 
> > get_online_cpus() is a typical coarse-grained lock.
> > It is a source of ABBA deadlocks.
> > 
> > Thanks to the CPU notifiers, some subsystems' global locks must
> > be acquired after cpu_hotplug.lock. A subsystem's global lock
> > is a coarse-grained lock too, so a lot of locks in the kernel
> > must be acquired after cpu_hotplug.lock (if we need
> > cpu_hotplug.lock held too).
> > 
> > Otherwise it may lead to an ABBA deadlock like this:
> > 
> > thread 1                                      |        thread 2
> > _cpu_down()                                   |  Lock a-kernel-lock.
> >   cpu_hotplug_begin()                         |
> >     down_write(&cpu_hotplug.lock)             |
> >   __raw_notifier_call_chain(CPU_DOWN_PREPARE) |  get_online_cpus()
> > ------------------------------------------------------------------------
> >     Lock a-kernel-lock.(wait thread2)         |    down_read(&cpu_hotplug.lock)
> >                                                    (wait thread 1)
> 
> Confused.  cpu_hotplug_begin() doesn't do
> down_write(&cpu_hotplug.lock).  If it _were_ to do that then yes, we'd
> be vulnerable to the above deadlock.

The current implementation is a bit more complex.  If you hold a kernel
mutex across get_online_cpus() and also acquire that same kernel mutex
in a hotplug notifier that permits sleeping, I believe that you really
can get a deadlock as follows:

Task 1					 |	Task 2
					 | mutex_lock(&mylock);
cpu_hotplug_begin()			 |
   mutex_lock(&cpu_hotplug.lock);	 |
   [assume cpu_hotplug.refcount == 0]	 | get_online_cpus()
---------------------------------------------------------------------------
   mutex_lock(&mylock);			 |   mutex_lock(&cpu_hotplug.lock);
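
The interleaving above can be replayed in a small userspace sketch. Nothing
in it is kernel code: struct toy_mutex, toy_trylock(), replay_abba(), and
the TASK1/TASK2 identifiers are hypothetical names chosen for illustration,
and a non-blocking acquire stands in for mutex_lock() so that the program
reports the cycle instead of hanging. It assumes, as in the diagram, that
cpu_hotplug.refcount == 0, so Task 1 still holds cpu_hotplug.lock when its
notifier runs.

```c
#include <stdbool.h>

/* Hypothetical task identifiers; 0 is reserved for "no owner". */
enum { TASK1 = 1, TASK2 = 2 };

/* Hypothetical toy lock that only records which task owns it. */
struct toy_mutex {
	int owner;	/* 0 when free, otherwise TASK1 or TASK2 */
};

/*
 * Take the lock if it is free and return true.  Return false where a
 * real mutex_lock() would put the caller to sleep.
 */
static bool toy_trylock(struct toy_mutex *m, int task)
{
	if (m->owner != 0)
		return false;
	m->owner = task;
	return true;
}

/*
 * Replay the four steps of the diagram in order.  Returns true when
 * each task ends up waiting for a lock that the other task owns,
 * which is the ABBA cycle.
 */
static bool replay_abba(void)
{
	struct toy_mutex mylock = { 0 };
	struct toy_mutex hotplug_lock = { 0 };	/* stands in for cpu_hotplug.lock */

	/* Task 2: mutex_lock(&mylock) */
	toy_trylock(&mylock, TASK2);

	/* Task 1: cpu_hotplug_begin() takes the hotplug lock and, with
	 * refcount assumed to be 0, keeps holding it. */
	toy_trylock(&hotplug_lock, TASK1);

	/* Task 1: the notifier wants mylock, which Task 2 owns. */
	bool task1_blocked = !toy_trylock(&mylock, TASK1);

	/* Task 2: get_online_cpus() wants the hotplug lock, which Task 1 owns. */
	bool task2_blocked = !toy_trylock(&hotplug_lock, TASK2);

	return task1_blocked && mylock.owner == TASK2 &&
	       task2_blocked && hotplug_lock.owner == TASK1;
}
```

replay_abba() returns true because each task is left waiting for a lock
that the other owns. If Task 2 instead called get_online_cpus() before
taking mylock, both tasks would acquire the two locks in the same order and
this particular cycle would not form.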


That said, when I look at the raw_notifier_call_chain() and
unregister_cpu_notifier() code paths, it is not obvious to me that they
exclude each other or otherwise protect the cpu_chain list...

							Thanx, Paul