lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 24 Aug 2009 21:30:16 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Frederic Weisbecker <fweisbec@...il.com>
cc:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	linux-kernel@...r.kernel.org, x86@...nel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Greg Kroah-Hartman <gregkh@...e.de>
Subject: Re: v2.6.31-rc6: BUG: unable to handle kernel NULL pointer dereference
 at 0000000000000008



On Mon, 24 Aug 2009, Linus Torvalds wrote:
> 
> Anyway, I'll happily be shown wrong. I think the (second) patch I sent out 
> is an acceptable hack in the presense of the current locking, but as I 
> said, I'm not exactly happy about it, because I do think the locking is 
> broken.

Btw, another solution to all this would be to just not have that 
ldisc_mutex deadlock due to do_tty_hangup -> tty_ldisc_hangup at all.

The actual _flushing_ doesn't need the mutex - it's just that both 
flushing and hangup is done with workqueues.

If we can avoid the deadlock by not having the (artificial) workqueue 
dependency, it would allow everybody to just hold on to the mutex over the 
whole sequence - and would obviate the need for that hacky 
TTY_LDISC_CHANGING bit thing in tty_set_ldisc.

In other words, the whole problem really comes in from the fact that 
do_tty_hangup() is called from "hangup_work", and the workqueues can get 
hung to the point where you can't then do the (totally _unrelated_) queue 
flushing.

Because flush_to_ldisc() itself - which is what we want to do - doesn't 
need that mutex or the workqueue at all. It could run from any context, 
afaik.

So if we were to turn it into just a timer (rather than a "delayed work"), 
then we'd not need to do that "flush_scheduled_work()" thing at all, and 
we wouldn't have that interaction with do_tty_hangup(). At which point we 
could again hold on to locks, because we wouldn't need to worry about the 
workqueues getting stuck on the mutex (that isn't even needed for the 
actual flushing part that we want to do!).

So don't get me wrong - there are _multiple_ ways to solve this. But they 
are all pretty major surgery, changing "big" semantics. We could fix the 
locking, we could change how we flush, we could do all of those things. 
And I'd love to. But I think the almost-oneliner is the safest approach 
right now. It's certainly not perfect, but it's fairly minimal impact.

		Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ