[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20090126121426.32f391a9.akpm@linux-foundation.org>
Date: Mon, 26 Jan 2009 12:14:26 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
tglx@...utronix.de, hpa@...or.com,
Rusty Russell <rusty@...tcorp.com.au>,
Mike Travis <travis@....com>
Subject: Re: [git pull] x86 fixes
On Mon, 26 Jan 2009 20:59:10 +0100
Ingo Molnar <mingo@...e.hu> wrote:
> * Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> > On Mon, 26 Jan 2009 20:20:16 +0100
> > Ingo Molnar <mingo@...e.hu> wrote:
> >
> > >
> > > * Andrew Morton <akpm@...ux-foundation.org> wrote:
> > >
> > > > On Mon, 26 Jan 2009 18:17:23 +0100
> > > > Ingo Molnar <mingo@...e.hu> wrote:
> > > >
> > > > > Rusty Russell (2):
> > > > > ...
> > > > > work_on_cpu: Use our own workqueue.
> > > >
> > > > wtf?
> > >
> > > an x86 fix depends on it:
> > >
> > > 7285908: cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
> > >
> >
> > Right. And these are currently under active (albeit rather slow)
> > discussion.
> >
> > The changelogs suck, nobody can be assed actually telling us what the
> > bug is and the patches just casually toss yet another gaggle of kernel
> > threads into there.
>
> One problem is that for example do_dbs_timer() [used both in the
> cpufreq_ondemand and cpufreq_conservative cpufreq drivers] gets queued
> into the generic kevent workqueue via schedule_work(). But work_on_cpu()
> needs to serialize on the worklet - i.e. it needs to do a flush_work() -
> and does this with the cpufreq lock held.
We've discovered many times that doing a workqueue flush with a lock
held is a bad idea. If any callback takes that lock, or takes some
other lock which someone else takes while holding the original lock,
etc, we deadlock.
> So we have a 'worklet inversion' bug here - and this got reported as a
> hard to debug boot hang on some systems.
> The root cause is that kevents is not that do_dbs_timer() uses
> schedule_work() - the root cause is that kevents workqueue is a too
> generic workqueue that is the union of all casual workqueue users in the
> kernel. That is fine (and it is its purpose) but it should not be used for
> core kernel facilities such as work_on_cpu() - precisely because doing so
> would limit that facility's genericity.
Yes, that's a point.
> This bug was always there but dormant until work_on_cpu() was used from a
> deep enough codepath.
>
> So the solution here is to isolate work_on_cpu()s mechanisms from the
> 'misc' workqueue that schedule_work() deals with - this is what Rusty's
> patch does.
But it's still deadlockable, isn't it? If you do a work_on_cpu() while
holding a lock, and the callback function takes that lock, deadlock?
If so, things didn't get much better?
> Your observation about there being too many workqueue threads is correct
> but this commit is IMO a valid use of the workqueue facility. This
> workqueue it only gets created on CONFIG_SMP so there cost is about ~10K
> RAM per CPU.
mm.. This is why I'd like to expend a little more effort trying to
avoid having to make this change.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists