Message-ID: <alpine.LFD.2.01.0908241632060.3824@localhost.localdomain>
Date:	Mon, 24 Aug 2009 16:51:03 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
cc:	linux-kernel@...r.kernel.org, x86@...nel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Greg Kroah-Hartman <gregkh@...e.de>
Subject: Re: v2.6.31-rc6: BUG: unable to handle kernel NULL pointer dereference
 at 0000000000000008



On Mon, 24 Aug 2009, Linus Torvalds wrote:
> 
> Untested. VERY untested. Just going by "that looks odd".

Btw, one issue here is that we at least sometimes do tty_ldisc_halt() 
under the tty->ldisc_mutex.  Now that's fine - as long as we never take 
that lock inside any delayed work. If we do, the delayed work itself may 
need the lock we hold in order to complete, and then the 
'cancel_delayed_work_sync()' thing might deadlock.

And sadly, we do end up having 'do_tty_hangup()' as a workqueue entry, and 
that one does tty_ldisc_hangup(), and that one in turn does take 
tty->ldisc_mutex.

So it looks like either we can't use the 'sync()' version, or we should 
never hold the ldisc_mutex while doing that tty_ldisc_halt(). Because 
waiting for the workqueue while holding the mutex looks like it could 
deadlock. It's probably very rare, but whatever.
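
To make the ordering problem concrete, here is a minimal userspace sketch 
using pthreads. It is not kernel code: 'hangup_work()' and 
'sync_wait_for_work()' are hypothetical stand-ins, the pthread mutex only 
mirrors the role of tty->ldisc_mutex, and pthread_join() plays the part of 
the synchronous wait. The worker uses trylock purely so that the demo 
reports the would-be deadlock instead of actually hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for tty->ldisc_mutex (assumption, not the kernel lock). */
static pthread_mutex_t ldisc_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the hangup work item: it needs ldisc_mutex to complete. */
static void *hangup_work(void *arg)
{
	bool *would_block = arg;

	/* A real work item would block here; trylock lets us report it. */
	if (pthread_mutex_trylock(&ldisc_mutex) == EBUSY) {
		*would_block = true;
		return NULL;
	}
	*would_block = false;
	pthread_mutex_unlock(&ldisc_mutex);
	return NULL;
}

/*
 * Start the work and wait for it synchronously, optionally while holding
 * the mutex. Returns true if the work could not get the mutex, which in
 * the blocking version means waiter and work would wait on each other.
 */
static bool sync_wait_for_work(bool hold_mutex)
{
	pthread_t worker;
	bool would_block = false;

	if (hold_mutex)
		pthread_mutex_lock(&ldisc_mutex);

	if (pthread_create(&worker, NULL, hangup_work, &would_block) == 0)
		pthread_join(worker, NULL);	/* the "_sync" wait */

	if (hold_mutex)
		pthread_mutex_unlock(&ldisc_mutex);

	return would_block;
}
```

Calling sync_wait_for_work(true) models the synchronous wait done under 
the mutex and reports that the work would have blocked; 
sync_wait_for_work(false) models waiting without the mutex held, and the 
work completes normally.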

Still, it would be good for people to test whether that patch makes the 
problem go away. Just to see if the issue really is a race between 
"tty_ldisc_halt()" and an ldisc being active on another CPU right then. 

But I wanted to let people know that the patch is clearly not the "last 
word" on this. It's a useful thing to try, but we need something better.

And it looks like we've hit that problem before, which is probably why 
tty_ldisc_halt() didn't use the sync version. Several of the callers of 
'tty_ldisc_halt()' do a flush_scheduled_work() afterwards, outside the 
ldisc_mutex. Of course, the sane one (tty_ldisc_release()) does a 
tty_ldisc_halt() even before taking the mutex lock.

			Linus