Message-ID: <20070109150351.GD9563@osiris.boeblingen.de.ibm.com>
Date: Tue, 9 Jan 2007 16:03:51 +0100
From: Heiko Carstens <heiko.carstens@...ibm.com>
To: Srivatsa Vaddagiri <vatsa@...ibm.com>
Cc: Benjamin Gilbert <bgilbert@...cmu.edu>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Gautham shenoy <ego@...ibm.com>, Andrew Morton <akpm@...l.org>
Subject: Re: Failure to release lock after CPU hot-unplug canceled
On Tue, Jan 09, 2007 at 05:57:40PM +0530, Srivatsa Vaddagiri wrote:
> On Tue, Jan 09, 2007 at 01:17:38PM +0100, Heiko Carstens wrote:
> > missing in kernel cpu.c in _cpu_down() in case CPU_DOWN_PREPARE
> > returned with NOTIFY_BAD. However... this reveals that there is just a
> > more fundamental problem.
> >
> > The workqueue code grabs a lock on CPU_[UP|DOWN]_PREPARE and releases it
> > again on CPU_DOWN_FAILED/CPU_UP_CANCELED. If something in the callchain
> > returns NOTIFY_BAD the rest of the entries in the callchain won't be
> > called anymore. But DOWN_FAILED/UP_CANCELED will be called for every
> > entry.
> > So we might even end up with a mutex_unlock(&workqueue_mutex) even if
> > mutex_lock(&workqueue_mutex) hasn't been called...
>
> This is a known problem. Gautham had sent out patches to address them
>
> http://lkml.org/lkml/2006/11/14/93
>
> Looks like they are in latest mm tree. Perhaps the testcase should be
> retried against latest mm.
Ah, nice! I wasn't aware of that. But I still think we should send a
CPU_DOWN_FAILED in case CPU_DOWN_PREPARE failed.

Also, the slab cache code hasn't been changed to make use of the
new CPU_LOCK_[ACQUIRE|RELEASE] stuff. I'm going to send patches in reply
to this mail.