lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 28 Jun 2013 23:03:33 +0100
From:	James Hogan <james.hogan@...tec.com>
To:	Denys Vlasenko <vda.linux@...glemail.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Ralf Baechle <ralf@...ux-mips.org>,
	David Daney <david.daney@...ium.com>,
	Oleg Nesterov <oleg@...hat.com>, linux-mips@...ux-mips.org,
	Al Viro <viro@...iv.linux.org.uk>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	David Howells <dhowells@...hat.com>,
	Dave Jones <davej@...hat.com>
Subject: Re: [PATCH v2] MIPS: Reduce _NSIG from 128 to 127 to avoid BUG_ON

On 28 June 2013 20:28, Denys Vlasenko <vda.linux@...glemail.com> wrote:
> On Monday 17 June 2013 12:36, James Hogan wrote:
>> On 14/06/13 17:03, James Hogan wrote:
>> > MIPS has 128 signals, the highest of which has the number 128 (they
>> > start from 1). The following command causes get_signal_to_deliver() to
>> > pass this signal number straight through to do_group_exit() as the exit
>> > code:
>> >
>> >   strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
>> >
>> > However do_group_exit() checks for the core dump bit (0x80) in the exit
>> > code which matches in this particular case and the kernel panics:
>> >
>> >   BUG_ON(exit_code & 0x80); /* core dumps don't get here */
>> >
>> > Lets avoid this by changing the ABI by reducing the number of signals to
>> > 127 (so that the maximum signal number is 127). Glibc incorrectly sets
>> > [__]SIGRTMAX to 127 already. uClibc sets it to 128 so it's conceivable
>> > that programs built against uClibc which intentionally uses RT signals
>> > from the top (SIGRTMAX-n, n>=0) would need an updated uClibc (and a
>> > rebuild if it's crazy enough to use __SIGRTMAX).
>>
>> Hmm, although this works around the BUG_ON, this doesn't actually seem
>> to be sufficient to behave correctly.
>>
>> So it appears the exit status is constructed like this:
>> bits  purpose
>> 0x007f        signal number (0-127)
>> 0x0080        core dump
>> 0xff00        exit status
>>
>> but the macros in waitstatus.h and wait.h in libc
>> (see also "man 2 wait"):
>> WIFEXITED:   status & 0x7f == 0
>> WIFSIGNALED: status & 0x7f in [1..126] (i.e. not 0 or 127)
>> WIFSTOPPED:  status & 0xff == 127
>>
>> So termination due to SIG127 looks like it's been stopped instead of
>> terminated via a signal, unless a core dump occurs in which case none of
>> the above match.
>>
>> (And termination due to SIG128 hits BUG_ON, otherwise would appear to
>> have exited normally with core dump).
>>
>>
>> Reducing number of signals to 126 to avoid this will change the glibc
>> ABI too, in which case we may as well reduce to 64 to match other
>> arches, which is more likely to break something (I'm not really
>> comfortable making that change).
>>
>> Reducing to 127 (this patch) still leaves incorrect exit status codes
>> for SIG127 ...
>>
>> Any further thoughts/opinions?
>
> Strictly speaking, exit status of 0x007f isn't ambiguous.
>
> Currently userspace uses the following rules
> (assuming that status is 16-bit (IOW, dropping PTRACE_EVENT bits)):
>
> WIFEXITED(status)    = (status & 0x7f) == 0
> WIFSIGNALED(status)  = (status & 0x7f) != 0 && (status & 0x7f) < 0x7f
> WIFSTOPPED(status)   = (status & 0xff) == 0x7f
> WIFCONTINUED(status) = (status == 0xffff)
>
> WEXITSTATUS(status)  = status >> 8
> WSTOPSIG(status)     = status >> 8
> WCOREDUMP(status)    = status & 0x80
> WTERMSIG(status)     = status & 0x7f
>
> When process dies from signal 127, status is 0x007f and it is not a valid
> "stopped by signal" indicator, since WSTOPSIG == 0 is an impossibility.
>
> Status 0x007f get misinterpreted by the rules above, namely,
> WIFSTOPPED is true, WIFSIGNALED is false.
>
> But an alternative definition exists which works correctly with
> all previous status codes, treats 0x007f as "killed by signal 127"
> and isn't more convoluted.
> In fact, while WIFSTOPPED needs one additional check,
> WIFSIGNALED gets simpler (loses one AND'ing operation):
>
> WIFSTOPPED(status)   = (status & 0xff) == 0x7f && (status >> 8) != 0
> WIFSIGNALED(status)  = status != 0 && status <= 0xff
>
> All other rules need no change.
>
> I think it's feasible to ask {g,uc}libc to change their defines
> (on MIPS as a minimum), and live with 127 signals.

Thanks for the explanation. This makes a lot of sense and if I
understand correctly it already describes the current behaviour of the
kernel up to SIG127 (I hadn't twigged WIFSTOPPED should imply
WSTOPSIG!=0 for some reason). I like it.

Cheers
James
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ