Message-ID: <alpine.LFD.1.10.0807101908330.2936@woody.linux-foundation.org>
Date:	Thu, 10 Jul 2008 19:22:54 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Roland McGrath <roland@...hat.com>
cc:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86_64: fix delayed signals



On Thu, 10 Jul 2008, Linus Torvalds wrote:
> 
> So now I'm considering just putting it in before the 2.6.26 release after 
> all ;)

.. and having looked at the code, and thought about it some more, I'm 
definitely off the patch again.

The reason is actually exactly the same bug that showed up when you did 
this for x86-32 three years ago, and that may in fact still be lurking.

The endless loop of "call do_notify_resume until all the work flags are 
zero" is very fragile: it will immediately cause a hard lockup if there is 
some circumstance where do_notify_resume will not clear the flag.
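To make the failure mode concrete, here is a minimal user-space sketch of 
that loop. It is not the kernel code: struct task_model, WORK_SIGPENDING, 
notify_resume_model() and exit_to_user_model() are hypothetical names 
invented for illustration, and the max_iters bound exists only so the demo 
terminates. The real exit path has no such bound, which is the whole 
problem.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration, not the kernel's definitions. */
#define WORK_SIGPENDING 0x1u

struct task_model {
    unsigned int work_flags;   /* models the per-thread work flags */
    bool handler_clears_flag;  /* false models a case where the flag sticks */
};

/* Models do_notify_resume(): it may or may not clear the pending flag. */
static void notify_resume_model(struct task_model *t)
{
    if (t->handler_clears_flag)
        t->work_flags &= ~WORK_SIGPENDING;
}

/*
 * Models "call do_notify_resume until all the work flags are zero".
 * Returns the number of iterations needed, or -1 if the flag was still
 * set after max_iters passes (the real loop would simply never return).
 */
static int exit_to_user_model(struct task_model *t, int max_iters)
{
    int iters = 0;

    while (t->work_flags != 0) {
        if (iters == max_iters)
            return -1;
        notify_resume_model(t);
        iters++;
    }
    return iters;
}
```

With handler_clears_flag set, the loop exits after one pass. With it 
clear, the model gives up at max_iters, which corresponds to the hard 
lockup described above.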

And when it comes to signals, there are several cases that can cause 
TIF_SIGPENDING to not be cleared:

 - confusion about user/kernel mode, where "do_signal()" will return 
   without doing anything at all if it thinks we're not in user mode.

   This was the bug we hit back in 2005 with an out-of-tree kernel-based 
   vm86 model (which hopefully has since died a painful death).

 - get_signal_to_deliver() returning and not handling the signal. 
   dequeue_signal() will do this for that collect_signal() case and for 
   the whole DRI notifier thing. The DRI notifier() case actually clears 
   TIF_SIGPENDING, but then we do "recalc_sigpending()" in the caller, so 
   it might get set again.

   I do hate that code (I know you do too), and the code _should_ block 
   the signal that gets ignored (so recalc_sigpending() should keep it 
   cleared), but it's not entirely obvious. Maybe it gets into an endless 
   loop of calling the notifier if this case ever triggers?

 - recalc_sigpending() expressly does not clear the TIF_SIGPENDING flag if 
   we hit the "freezing(current)" case. So TIF_SIGPENDING stays set for 
   freezing() processes. I think (and *hope*) they all get caught by other 
   means anyway in that whole do_notify_resume() loop, but this is another 
   of those "the freezer code is insane, I'm not going to try to think it 
   through" cases.
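
The last two cases both hinge on when the pending flag is recomputed. The 
following is a simplified user-space model of that decision, not the 
kernel's recalc_sigpending(): struct sig_model, has_deliverable() and 
recalc_sigpending_model() are hypothetical names, and the model assumes 
only three inputs (queued signals, blocked signals, and a freezing state).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state that feeds the pending-flag decision. */
struct sig_model {
    uint64_t pending;      /* bitmask of queued signals */
    uint64_t blocked;      /* bitmask of blocked signals */
    bool freezing;         /* models freezing(current) */
    bool sigpending_flag;  /* models TIF_SIGPENDING */
};

/* A signal is deliverable if it is queued and not blocked. */
static bool has_deliverable(const struct sig_model *s)
{
    return (s->pending & ~s->blocked) != 0;
}

/*
 * Set the flag when something is deliverable; clear it only when nothing
 * is deliverable AND the task is not freezing. In the freezing case the
 * flag is deliberately left exactly as it was.
 */
static void recalc_sigpending_model(struct sig_model *s)
{
    if (has_deliverable(s))
        s->sigpending_flag = true;
    else if (!s->freezing)
        s->sigpending_flag = false;
}
```

In this model, a queued signal that is also blocked lets the flag clear, 
which is why blocking the ignored signal matters in the notifier case. 
With freezing set, the same state leaves the flag set, so a loop that 
waits for it to reach zero depends on something else making progress.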

In short, I think your patch is fine, but I'm also nervous enough 
about it that I'm not going to apply it now. Any bugs it could expose look 
very unlikely, and if they exist they are probably bugs on 32-bit as we 
speak, but call me a worry-wart.

		Linus