Message-ID: <20090615040409.GA30979@in.ibm.com>
Date: Mon, 15 Jun 2009 09:34:09 +0530
From: Gautham R Shenoy <ego@...ibm.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
rusty@...tcorp.com.au, mingo@...e.hu, linux-kernel@...r.kernel.org,
peterz@...radead.org, oleg@...hat.com, dipankar@...ibm.com
Subject: Re: [PATCH -mm resend] cpuhotplug: introduce try_get_online_cpus()
take 3
On Thu, Jun 11, 2009 at 11:50:15AM -0700, Paul E. McKenney wrote:
> On Thu, Jun 11, 2009 at 04:41:42PM +0800, Lai Jiangshan wrote:
> > Andrew Morton wrote:
> > >
> > > I still think we should really avoid having to do this. trylocks are
> > > nasty things.
> > >
> > > Looking at the above, one would think that a correct fix would be to fix
> > > the bug in "thread 2": take the locks in the correct order? As
> > > try_get_online_cpus() doesn't actually have any callers, it's hard to
> > > take that thought any further.
> >
> > Sometimes, we can not reorder the locks' order.
> > try_get_online_cpus() is really needless when no one uses it.
> >
> > Paul's expedited RCU V7 may need it:
> > http://lkml.org/lkml/2009/5/22/332
> >
> > So this patch can be omitted when Paul does not use it.
> > It's totally OK for me.
>
> Although my patch does not need it in and of itself, if someone were
> to hold a kernel mutex across synchronize_sched_expedited(), and also
> acquire that same kernel mutex in a hotplug notifier, the deadlock that
> Lai calls out would occur.
>
> Even if no one uses synchronize_sched_expedited() in this manner, I feel
> that it is good to explore the possibility of dealing with it. As
> Andrew Morton pointed out, CPU-hotplug locking is touchy, so on-the-fly
> fixes are to be avoided if possible.
Agreed. Though I like the atomic refcount version of
get_online_cpus()/put_online_cpus() that Lai has proposed.
Anyway, to recall why try_get_online_cpus() was needed when it was
proposed last year: it was meant to be used in worker thread context.
Back then we could not call get_online_cpus() from worker thread
context, for fear of the following deadlock during a cpu-hotplug
operation:
Thread 1: (cpu_offline)            | Thread 2: (worker_thread)
-----------------------------------------------------------------------
cpu_hotplug_begin();               |
.                                  |
.                                  | get_online_cpus(); /* Blocks */
.                                  |
.                                  |
CPU_DEAD:                          |
workqueue_cpu_callback();          |
cleanup_workqueue_thread()         |
/* Waits for worker thread         |
 * to finish.                      |
 * Hence a deadlock.               |
 */                                |
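To illustrate why a trylock variant avoids the scenario above, here is a
minimal userspace sketch using C11 atomics. It is not the proposed patch and
not kernel code: every model_* name is hypothetical, and the hotplug side is
modelled as a non-blocking "try" only to keep the example single-threaded
(the real cpu_hotplug_begin() waits for readers instead).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: state >= 0 counts readers, -1 means hotplug is active. */
#define MODEL_HOTPLUG_ACTIVE (-1)

static atomic_int model_cpu_hotplug_state = 0;

/* Take a reader reference unless a hotplug operation is in progress.
 * Never blocks: returns false instead of waiting for hotplug to finish. */
bool model_try_get_online_cpus(void)
{
	int old = atomic_load(&model_cpu_hotplug_state);

	do {
		if (old == MODEL_HOTPLUG_ACTIVE)
			return false;
		/* On failure the CAS reloads 'old', so the check is repeated. */
	} while (!atomic_compare_exchange_weak(&model_cpu_hotplug_state,
					       &old, old + 1));
	return true;
}

/* Drop a reference taken by a successful model_try_get_online_cpus(). */
void model_put_online_cpus(void)
{
	int old = atomic_fetch_sub(&model_cpu_hotplug_state, 1);

	assert(old > 0);
	(void)old;
}

/* Simplification for this sketch: succeed only if there are no readers. */
bool model_try_cpu_hotplug_begin(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&model_cpu_hotplug_state,
					      &expected, MODEL_HOTPLUG_ACTIVE);
}

/* End the hotplug operation and let readers in again. */
void model_cpu_hotplug_done(void)
{
	int old = atomic_exchange(&model_cpu_hotplug_state, 0);

	assert(old == MODEL_HOTPLUG_ACTIVE);
	(void)old;
}
```

In this model, a worker thread that gets false back would skip or defer its
work instead of sleeping, so the cleanup_workqueue_thread() step in Thread 1
would not end up waiting on a blocked worker.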
This was fixed by introducing the CPU_POST_DEAD event, the notification
>
> Thanx, Paul
--
Thanks and Regards
gautham