Message-Id: <20090126103529.cb124a58.akpm@linux-foundation.org>
Date: Mon, 26 Jan 2009 10:35:29 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: rusty@...tcorp.com.au, travis@....com, mingo@...hat.com,
davej@...hat.com, cpufreq@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

On Mon, 26 Jan 2009 18:16:18 +0100
Ingo Molnar <mingo@...e.hu> wrote:
>
> * Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> > > > Yet another kernel thread for each CPU. All because of some dung
> > > > way down in arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c.
> > > >
> > > > Is there no other way?
> > >
> > > Perhaps, but this works. Trying to be clever got me into this mess in
> > > the first place.
> > >
> > > We could stop using workqueues and change work_on_cpu to create a
> > > thread every time, which would give it a new failure mode so I don't
> > > know that everyone could use it any more. Or we could keep a single
> > > thread around to do all the cpus, and duplicate much of the workqueue
> > > code.
> > >
> > > None of these options are appealing...
> >
> > Can we try harder please? 10 screenfuls of kernel threads in the ps
> > output is just irritating.
> >
> > How about banning the use of work_on_cpu() from schedule_work() handlers
> > and then fixing that driver somehow?
>
> Yes, but that's fundamentally fragile: anyone who happens to stick the
> wrong thing into keventd (and it's dead easy because schedule_work() is
> easy to use) will lock up work_on_cpu() users.
>

--- a/kernel/workqueue.c~a
+++ a/kernel/workqueue.c
@@ -998,6 +998,8 @@ long work_on_cpu(unsigned int cpu, long
 {
 	struct work_for_cpu wfc;
 
+	BUG_ON(current_is_keventd());
+
 	INIT_WORK(&wfc.work, do_work_for_cpu);
 	wfc.fn = fn;
 	wfc.arg = arg;
_

That wasn't so hard.

> work_on_cpu() is an important (and lowlevel enough) facility to be
> isolated from casual interaction like that.

We have one single (known) caller in the whole kernel. This is not
worth adding another great pile of kernel threads for!

> > What _is_ the bug anyway? The only description we were given was
> >
> > Impact: remove potential clashes with generic kevent workqueue
> >
> > Annoyingly, some places we want to use work_on_cpu are already in
> > workqueues. As per Ingo's suggestion, we create a different
> > workqueue for work_on_cpu.
> >
> > which didn't bother telling anyone squat.
> >
> > When was this bug added? Was it added into that driver or was it due to
> > infrastructural changes?
>
> This fixes lockups during bootup caused by the cpumask changes/cleanups
> which changed set_cpus_allowed()+on-kernel-stack-cpumask_t to
> work_on_cpu().
>
> Which was fine except it didn't take into account the interaction with the
> kevents workqueue and the very wide cross section for worklet dependencies
> that this brings with itself. work_on_cpu() was rarely used before so this
> didn't show up.

Am still awaiting an understandable description of this alleged bug :(
If someone can be persuaded to cough up this information (which should
have been in the changelog from day #1) then perhaps someone will be
able to think of a more pleasing fix. That's one of the reasons for
writing changelogs.
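
For readers following the thread, here is a rough userspace sketch of the hazard under discussion and of what the one-line check above guards against. It is a toy, single-threaded model and not kernel code: `work_on_cpu_model()`, `run_on_keventd()`, `current_is_keventd_model()` and `refused_calls` are hypothetical names invented for illustration. It assumes one worker that runs handlers to completion, so a handler that queues more work on the same worker and waits for it could never make progress. The model returns -1 where the patch would hit BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model of a single keventd-style worker. Hypothetical names throughout. */
typedef long (*work_fn)(void *arg);

static bool in_keventd;     /* true while the modelled worker runs a handler */
static int refused_calls;   /* nested calls that would have waited forever */

/* Stand-in for the current_is_keventd() check used in the patch. */
static bool current_is_keventd_model(void)
{
	return in_keventd;
}

/* The worker executes one queued item to completion. */
static long run_on_keventd(work_fn fn, void *arg)
{
	long ret;

	in_keventd = true;
	ret = fn(arg);
	in_keventd = false;
	return ret;
}

/*
 * Model of work_on_cpu(): queue fn on the worker and wait for the result.
 * If the caller is already running on that worker, the queued item could
 * never start, so the wait would never finish. The patch turns that case
 * into BUG_ON(); this sketch just counts it and returns -1.
 */
static long work_on_cpu_model(work_fn fn, void *arg)
{
	if (current_is_keventd_model()) {
		refused_calls++;
		return -1;
	}
	return run_on_keventd(fn, arg);
}

/* A trivial piece of work: read back the value passed in. */
static long read_value(void *arg)
{
	return *(long *)arg;
}

/* A handler that itself calls work_on_cpu_model(), i.e. the problem case. */
static long nested_handler(void *arg)
{
	return work_on_cpu_model(read_value, arg);
}
```

Calling `work_on_cpu_model(read_value, &v)` from ordinary context returns the value, while `run_on_keventd(nested_handler, &v)` reaches the guarded path and returns -1 instead of waiting on itself.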