linux-kernel - Re: Signal delivery order

Message-Id: <200903160934.03700.mega@retes.hu>
Date:	Mon, 16 Mar 2009 09:34:03 +0100
From:	Gábor Melis <mega@...es.hu>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Signal delivery order

On Monday 16 March 2009, Oleg Nesterov wrote:
> On 03/15, Gábor Melis wrote:
> > On Sunday 15 March 2009, Oleg Nesterov wrote:
> > > Now, since there are no more pending signals, we return to the
> > > user space, and start sig_2().
> >
> > I see. I guess in addition to changing the ip, the stack frobbing
> > magic arranges that sig_2 returns to sig_1 or some code that calls
> > sig_1.
>
> yes. "some code" == rt_sigreturn,
>
> > The revised signal-delivery-order.c (also attached) outputs:
> >
> > test_handler=8048727
> > sigsegv_handler=804872c
> > eip: 8048727
> > esp: b7d94cb8
> >
> > which shows that sigsegv_handler also has incorrect eip in the
> > context.
>
> Why do you think it is not correct?
>
> I didn't try your test-case, but I can't see where "esp: b7d94cb8"
> comes from. But "eip: 8048727" looks exactly right, this is the
> address of test_handler.

Sorry, I removed the printing of esp from the code as it was not 
relevant to my point but pasted the output of a previous run.

Anyway, I think eip is incorrect in sigsegv_handler because it does not
point to the instruction that caused the SIGSEGV. More generally, the
whole ucontext is wrong: it looks as if sigsegv_handler had been invoked
from within test_handler.

This is problematic if the SIGSEGV handler wants to do something with
the context. The real-life SIGSEGV handler that has been failing does
this:
- skips the offending instruction by incrementing eip
- takes esp from the context and frobs the control stack so that some
function is called on return from the handler (the handler itself runs
on the altstack). This is not unlike what the kernel does, it seems.
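
For reference, here is a minimal sketch of how a handler gets at the
saved instruction and stack pointers that the two steps above start
from. It is not the code of the real handler: the names context_ip,
context_sp, record_context, probe_context, seen_ip and seen_sp are made
up for illustration, SIGUSR2 is used only because it is harmless to
raise, and the register names assume glibc on x86 or aarch64.

```c
#define _GNU_SOURCE   /* glibc: exposes REG_EIP / REG_RIP / REG_ESP / REG_RSP */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

/* Filled in by the handler; volatile because they are written asynchronously. */
static volatile uintptr_t seen_ip;
static volatile uintptr_t seen_sp;

/* Saved instruction pointer of the interrupted code (hypothetical helper). */
static uintptr_t context_ip(const ucontext_t *uc)
{
#if defined(__i386__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.pc;
#else
#error "unsupported architecture in this sketch"
#endif
}

/* Saved stack pointer of the interrupted code (hypothetical helper). */
static uintptr_t context_sp(const ucontext_t *uc)
{
#if defined(__i386__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_ESP];
#elif defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RSP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.sp;
#else
#error "unsupported architecture in this sketch"
#endif
}

/* SA_SIGINFO handler: the third argument is the ucontext of the
 * interrupted code, i.e. what the kernel resumes on handler return. */
static void record_context(int signo, siginfo_t *info, void *ctx)
{
    const ucontext_t *uc = (const ucontext_t *)ctx;
    (void)signo;
    (void)info;
    seen_ip = context_ip(uc);
    seen_sp = context_sp(uc);
}

/* Install the handler for SIGUSR2, deliver one signal to ourselves,
 * and return 0 on success or -1 on failure. */
int probe_context(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = record_context;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;
    return raise(SIGUSR2) == 0 ? 0 : -1;
}
```

After probe_context() returns 0, seen_ip and seen_sp hold the values
the handler saw. A handler that edits the context would write to the
same gregs slots instead of reading them.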

Now, "some function" cannot be called with SIGUSR1 blocked because that
could lead to deadlocks. (SIGUSR1 is sent to a thread when the garbage
collector wants to stop it, and "some function" allocates.)

So a context in the SIGSEGV handler that points to the SIGUSR1 handler
loses in several ways: it carries an unexpected sigmask (SIGUSR1 is
blocked), its eip does not point to the right instruction, and the
SIGUSR1 handler won't finish until "some function" returns ...

It seems to me that the same problem could be triggered by
pthread_kill()ing a thread that's sigtrapping if the number of the
signal sent is lower than that of SIGTRAP, say it's SIGINT.

In a nutshell, the context argument is wrong.

> Oleg.