linux-kernel - Re: [git pull] x86 fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20090126121426.32f391a9.akpm@linux-foundation.org>
Date:	Mon, 26 Jan 2009 12:14:26 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	tglx@...utronix.de, hpa@...or.com,
	Rusty Russell <rusty@...tcorp.com.au>,
	Mike Travis <travis@....com>
Subject: Re: [git pull] x86 fixes

On Mon, 26 Jan 2009 20:59:10 +0100
Ingo Molnar <mingo@...e.hu> wrote:

> * Andrew Morton <akpm@...ux-foundation.org> wrote:
> 
> > On Mon, 26 Jan 2009 20:20:16 +0100
> > Ingo Molnar <mingo@...e.hu> wrote:
> > 
> > > 
> > > * Andrew Morton <akpm@...ux-foundation.org> wrote:
> > > 
> > > > On Mon, 26 Jan 2009 18:17:23 +0100
> > > > Ingo Molnar <mingo@...e.hu> wrote:
> > > > 
> > > > > Rusty Russell (2):
> > > > >       ...
> > > > >       work_on_cpu: Use our own workqueue.
> > > > 
> > > > wtf?
> > > 
> > > an x86 fix depends on it:
> > > 
> > > 7285908: cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
> > > 
> > 
> > Right.  And these are currently under active (albeit rather slow) 
> > discussion.
> > 
> > The changelogs suck, nobody can be assed actually telling us what the 
> > bug is and the patches just casually toss yet another gaggle of kernel 
> > threads into there.
> 
> One problem is that for example do_dbs_timer() [used both in the 
> cpufreq_ondemand and cpufreq_conservative cpufreq drivers] gets queued 
> into the generic kevent workqueue via schedule_work(). But work_on_cpu() 
> needs to serialize on the worklet - i.e. it needs to do a flush_work() - 
> and does this with the cpufreq lock held.

We've discovered many times that doing a workqueue flush with a lock
held is a bad idea.  If any callback takes that lock, or takes some
other lock which someone else takes while holding the original lock,
etc, we deadlock.

> So we have a 'worklet inversion' bug here - and this got reported as a 
> hard to debug boot hang on some systems.
> The root cause is that kevents is not that do_dbs_timer() uses 
> schedule_work() - the root cause is that kevents workqueue is a too 
> generic workqueue that is the union of all casual workqueue users in the 
> kernel. That is fine (and it is its purpose) but it should not be used for 
> core kernel facilities such as work_on_cpu() - precisely because doing so 
> would limit that facility's genericity.

Yes, that's a point.

> This bug was always there but dormant until work_on_cpu() was used from a 
> deep enough codepath.
> 
> So the solution here is to isolate work_on_cpu()s mechanisms from the 
> 'misc' workqueue that schedule_work() deals with - this is what Rusty's 
> patch does.

But it's still deadlockable, isn't it?  If you do a work_on_cpu() while
holding a lock, and the callback function takes that lock, deadlock? 
If so, things didn't get much better?

> Your observation about there being too many workqueue threads is correct 
> but this commit is IMO a valid use of the workqueue facility. This 
> workqueue it only gets created on CONFIG_SMP so there cost is about ~10K 
> RAM per CPU.

mm..  This is why I'd like to expend a little more effort trying to
avoid having to make this change.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/