linux-kernel - Re: [git pull] x86 fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090126195910.GA13471@elte.hu>
Date:	Mon, 26 Jan 2009 20:59:10 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	tglx@...utronix.de, hpa@...or.com
Subject: Re: [git pull] x86 fixes

* Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Mon, 26 Jan 2009 20:20:16 +0100
> Ingo Molnar <mingo@...e.hu> wrote:
> 
> > 
> > * Andrew Morton <akpm@...ux-foundation.org> wrote:
> > 
> > > On Mon, 26 Jan 2009 18:17:23 +0100
> > > Ingo Molnar <mingo@...e.hu> wrote:
> > > 
> > > > Rusty Russell (2):
> > > >       ...
> > > >       work_on_cpu: Use our own workqueue.
> > > 
> > > wtf?
> > 
> > an x86 fix depends on it:
> > 
> > 7285908: cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
> > 
> 
> Right.  And these are currently under active (albeit rather slow) 
> discussion.
> 
> The changelogs suck, nobody can be assed actually telling us what the 
> bug is and the patches just casually toss yet another gaggle of kernel 
> threads into there.

One problem is that for example do_dbs_timer() [used both in the 
cpufreq_ondemand and cpufreq_conservative cpufreq drivers] gets queued 
into the generic kevent workqueue via schedule_work(). But work_on_cpu() 
needs to serialize on the worklet - i.e. it needs to do a flush_work() - 
and does this with the cpufreq lock held.

So we have a 'worklet inversion' bug here - and this got reported as a 
hard to debug boot hang on some systems.
The root cause is that kevents is not that do_dbs_timer() uses 
schedule_work() - the root cause is that kevents workqueue is a too 
generic workqueue that is the union of all casual workqueue users in the 
kernel. That is fine (and it is its purpose) but it should not be used for 
core kernel facilities such as work_on_cpu() - precisely because doing so 
would limit that facility's genericity.

This bug was always there but dormant until work_on_cpu() was used from a 
deep enough codepath.

So the solution here is to isolate work_on_cpu()s mechanisms from the 
'misc' workqueue that schedule_work() deals with - this is what Rusty's 
patch does.

Your observation about there being too many workqueue threads is correct 
but this commit is IMO a valid use of the workqueue facility. This 
workqueue it only gets created on CONFIG_SMP so there cost is about ~10K 
RAM per CPU.

I also agree that the commit should have included this description.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/