linux-kernel - Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20090126130046.37b8f34e.akpm@linux-foundation.org>
Date:	Mon, 26 Jan 2009 13:00:46 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	rusty@...tcorp.com.au, travis@....com, mingo@...hat.com,
	davej@...hat.com, cpufreq@...r.kernel.org,
	linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

On Mon, 26 Jan 2009 21:20:22 +0100
Ingo Molnar <mingo@...e.hu> wrote:

> 
> * Andrew Morton <akpm@...ux-foundation.org> wrote:
> 
> > On Mon, 26 Jan 2009 18:16:18 +0100
> > Ingo Molnar <mingo@...e.hu> wrote:
> > 
> > > 
> > > * Andrew Morton <akpm@...ux-foundation.org> wrote:
> > > 
> > > > > > Yet another kernel thread for each CPU.  All because of some dung 
> > > > > > way down in arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c.
> > > > > > 
> > > > > > Is there no other way?
> > > > > 
> > > > > Perhaps, but this works.  Trying to be clever got me into this mess in 
> > > > > the first place.
> > > > > 
> > > > > We could stop using workqueues and change work_on_cpu to create a 
> > > > > thread every time, which would give it a new failure mode so I don't 
> > > > > know that everyone could use it any more.  Or we could keep a single 
> > > > > thread around to do all the cpus, and duplicate much of the workqueue 
> > > > > code.
> > > > > 
> > > > > None of these options are appealing...
> > > > 
> > > > Can we try harder please?  10 screenfuls of kernel threads in the ps 
> > > > output is just irritating.
> > > > 
> > > > How about banning the use of work_on_cpu() from schedule_work() handlers 
> > > > and then fixing that driver somehow?
> > > 
> > > Yes, but that's fundamentally fragile: anyone who happens to stick the 
> > > wrong thing into keventd (and it's dead easy because schedule_work() is 
> > > easy to use) will lock up work_on_cpu() users.
> > > 
> > 
> > --- a/kernel/workqueue.c~a
> > +++ a/kernel/workqueue.c
> > @@ -998,6 +998,8 @@ long work_on_cpu(unsigned int cpu, long 
> >  {
> >  	struct work_for_cpu wfc;
> >  
> > +	BUG_ON(current_is_keventd());
> > +
> >  	INIT_WORK(&wfc.work, do_work_for_cpu);
> >  	wfc.fn = fn;
> >  	wfc.arg = arg;
> > _
> > 
> > 
> > That wasn't so hard.
> 
> What is the purpose of your change? I'm not sure you understood the 
> problem.

Well.  That's because I was forced to resort to guesswork.

> The problem is not with work_on_cpu() usage. The problem is:
> 
>  1) holding locks while calling work_on_cpu()
> 
>  2) same locks being taken by a worklet used by some other code
> 
> work_on_cpu() really wants to serialize on its own workload only, not on 
> the other stuff that might be sometimes be queued up in the keventd 
> workqueue.

but but but, we fixed that ages ago, I think.  But I don't see the code
there.

If we want to wait on a *particular* keventd work item then we
shouldn't wait on all the other queued ones.

- If it's currently running, wait on it

- If it isn't yet running, detach it from the queue and run it directly.

Maybe I'm thinking of a different subsystem, but I don't think so. 
Maybe Oleg recalls what happened to that?

> > > work_on_cpu() is an important (and lowlevel enough) facility to be 
> > > isolated from casual interaction like that.
> > 
> > We have one single (known) caller in the whole kernel.  This is not 
> > worth adding another great pile of kernel threads for!
> 
> i'd expect there to be more as part of the cpumask stack reduction 
> patches that Rusty and Mike are working on.
> 
> in any case it's a correctness issue: work_on_cpu() is a just as generic 
> facility as on_each_cpu() - with the difference that it can handle 
> blocking contexts too.

Well on_each_cpu() has restrictions.  Can't all it with local
interrupts disabled.  Can't call it (synchronously) while holding locks
which the callback takes.

> So if it's generic it ought to be implemented in a generic way - not a 
> "dont use from any codepath that has a lock held that might occasionally 
> also be held in a keventd worklet". (which is a totally unmaintainable 
> proposition and which would just cause repeat bugs again and again.)

That's different.  The core fault here lies in the keventd workqueue
handling code.  If we're flushing work A then we shouldn't go and block
behind unrelated work B.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/