Message-Id: <D73E5C37-DC92-4D58-A163-0B20143AAEEB@amacapital.net>
Date:   Fri, 16 Mar 2018 09:01:47 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Christian Brauner <christian.brauner@...lbox.org>
Cc:     Andy Lutomirski <luto@...nel.org>, Tycho Andersen <tycho@...ho.ws>,
        Kees Cook <keescook@...omium.org>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Akihiro Suda <suda.akihiro@....ntt.co.jp>,
        LKML <linux-kernel@...r.kernel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Christian Brauner <christian.brauner@...onical.com>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Tyler Hicks <tyhicks@...onical.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Subject: Re: [RFC 0/3] seccomp trap to userspace



> On Mar 16, 2018, at 7:47 AM, Christian Brauner <christian.brauner@...lbox.org> wrote:
> 
>> On Fri, Mar 16, 2018 at 12:46:55AM +0000, Andy Lutomirski wrote:


I bet I confused everyone with a blatant typo:

>> 
>> Hmm, I think we have to be very careful to avoid nasty races.  I think
>> the correct approach is to notice the signal and send a message to the
>> listener that a signal is pending but to take no additional action.
>> If the handler ends up completing the syscall with a successful
>> return, we don't want to replace it with -EINTR.  IOW the code looks
>> kind of like:
>> 
>> send_to_listener("hey I got a signal");

That should be “hey I got a syscall”.   D’oh!

>> wait_ret = wait_interruptible for the listener to reply;
>> if (wait_ret == -EINTR) {
> 
> Hm, so from the pseudo-code it looks like: The handler would inform the
> listener that it received a signal (either from the syscall requester or
> from somewhere else) and then wait for the listener to reply to that
> message.  This would allow the listener to decide what action it wants
> the handler to take based on the signal, i.e. either cancel the request
> or retry?  The comment makes it sound like that the handler doesn't
> really wait on the listener when it receives a signal it simply moves
> on.

It keeps waiting killably but not interruptibly. 
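To make that concrete, here is a rough userspace model of the wait, not kernel code. All names (wake_event, wait_outcome, model_wait) are made up for illustration. It assumes the first non-fatal signal produces one "signal pending" notice to the listener, after which only a fatal signal or the listener's reply ends the wait:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events that can reach the task blocked on the listener. */
enum wake_event {
    WAKE_SIGNAL,        /* ordinary, non-fatal signal became pending */
    WAKE_FATAL_SIGNAL,  /* e.g. SIGKILL */
    WAKE_REPLY          /* the listener answered the request */
};

struct wait_outcome {
    long ret;            /* syscall return value; valid only if replied */
    int  signal_notices; /* "signal pending" messages sent to the listener */
    bool replied;        /* the listener completed the syscall */
    bool killed;         /* the task died before any reply arrived */
};

/*
 * Replay a trace of events. The wait starts out interruptible; after the
 * first non-fatal signal the listener is notified once and the wait
 * continues killably, so further non-fatal signals are ignored.
 * listener_ret is whatever the listener chooses, e.g. 0 or -EINTR.
 */
static struct wait_outcome model_wait(const enum wake_event *ev, size_t n,
                                      long listener_ret)
{
    struct wait_outcome out = { 0, 0, false, false };
    bool interruptible = true;

    assert(ev != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case WAKE_REPLY:
            /* The listener's value is used as-is, never replaced. */
            out.ret = listener_ret;
            out.replied = true;
            return out;
        case WAKE_FATAL_SIGNAL:
            out.killed = true;
            return out;
        case WAKE_SIGNAL:
            if (interruptible) {
                out.signal_notices++;   /* tell the listener, nothing more */
                interruptible = false;  /* from here on: killable only */
            }
            break;
        }
    }
    return out; /* trace ended while still waiting */
}
```

In this model a trace of signal, signal, reply yields the listener's return value and exactly one notice. The kernel never substitutes -EINTR on its own; the listener may still pick -EINTR as its reply.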

> So no "taking no additional action" here means not have the handler
> decide to abort but the listener?

If by “handler” you mean kernel, then yes. 

There’s no userspace syscall handler involved. From the kernel’s perspective, a syscall is never still in progress when a signal handler is invoked: we only actually invoke signal handlers in prepare_exit_to_usermode() or the non-x86 equivalent and the functions it calls. While a syscall is running, the kernel might notice that a signal is pending and do one of a few things:

1. Just keep going. Not all syscalls can be interrupted. 

2. Try to finish early. If a send() call has already sent some but not all data, it can stop waiting and return the number of bytes sent.

3. Abort with -EINTR.

4. Abort with -ERESTARTSYS or one of its relatives. These fiddle with user registers in a somewhat unpleasant way to pretend that the syscall never actually happened.  This works for syscalls that wait with an absolute timeout, for example. 

5. Set up restart_syscall() magic, rewrite regs so it looks like the user was about to call restart_syscall() when the signal happened, and abort. 

In all cases, the signal is dealt with afterwards. This could result in changing regs to call the handler or in simply returning. 
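The five reactions can be summarized in a small userspace model. This is only a sketch: sig_reaction, model_syscall_result and visible_to_userspace are illustrative names, not kernel functions. The two MODEL_ constants mirror the kernel-internal values in include/linux/errno.h, and a send()-like call is used for every case purely to keep the example short:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal restart codes, copied here because userspace never sees them. */
#define MODEL_ERESTARTSYS           512
#define MODEL_ERESTART_RESTARTBLOCK 516

/* Hypothetical labels for the five reactions, numbered as in the list. */
enum sig_reaction {
    KEEP_GOING = 1,       /* 1: ignore the pending signal for now */
    FINISH_EARLY,         /* 2: return the partial result */
    ABORT_EINTR,          /* 3: fail with -EINTR */
    ABORT_RESTARTSYS,     /* 4: ask for a transparent restart */
    ABORT_RESTART_BLOCK   /* 5: restart via restart_syscall() */
};

/*
 * In-kernel result of a send()-like call for `total` bytes, of which
 * `sent` had gone out when the pending signal was noticed.
 */
static long model_syscall_result(enum sig_reaction r, size_t total, size_t sent)
{
    assert(sent <= total);

    switch (r) {
    case KEEP_GOING:          return (long)total;
    case FINISH_EARLY:        return (long)sent;  /* only sensible if sent > 0 */
    case ABORT_EINTR:         return -EINTR;
    case ABORT_RESTARTSYS:    return -MODEL_ERESTARTSYS;
    case ABORT_RESTART_BLOCK: return -MODEL_ERESTART_RESTARTBLOCK;
    }
    return -EINVAL;
}

/* Restart codes are rewritten before returning to userspace. */
static bool visible_to_userspace(long ret)
{
    return ret != -MODEL_ERESTARTSYS && ret != -MODEL_ERESTART_RESTARTBLOCK;
}
```

Only the first three results are values a program can actually observe. The last two exist solely between the syscall and the signal-delivery code.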

1-3 should work fully in seccomp. The only issue is that the kernel doesn’t know *which* to do, nor can the kernel force the listener to abort cleanly, so I think we have no real choice but to let the listener decide. 

4 could be supported just like 1-3. 5 is awful, and I don’t think we should support it for user listeners. 
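For anyone who wants to see what 3 and 4 look like from userspace today, independent of seccomp, here is a small sketch. It assumes Linux and that a blocking read() on an empty pipe is restartable as described in signal(7). The helper name read_across_signal and the 50 ms timer are arbitrary choices for the example:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaction(), setitimer(), SA_RESTART */
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int demo_pipe[2];

/* Handler: put one byte in the pipe so a restarted read() can finish. */
static void on_alarm(int sig)
{
    ssize_t n;

    (void)sig;
    n = write(demo_pipe[1], "x", 1);
    (void)n;
}

/*
 * Block in read() on an empty pipe and let SIGALRM arrive about 50 ms
 * later. sa_flags is either 0 or SA_RESTART. Returns read()'s result,
 * with errno preserved, or -2 if the pipe could not be created.
 */
static ssize_t read_across_signal(int sa_flags)
{
    struct sigaction sa, old;
    struct itimerval tv;
    ssize_t ret;
    int saved_errno;
    char c;

    assert(sa_flags == 0 || sa_flags == SA_RESTART);

    if (pipe(demo_pipe) != 0)
        return -2;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sa.sa_flags = sa_flags;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old);

    memset(&tv, 0, sizeof(tv));
    tv.it_value.tv_usec = 50000;        /* one-shot timer, 50 ms */
    setitimer(ITIMER_REAL, &tv, NULL);

    ret = read(demo_pipe[0], &c, 1);    /* interrupted while blocked */
    saved_errno = errno;

    sigaction(SIGALRM, &old, NULL);     /* restore previous disposition */
    close(demo_pipe[0]);
    close(demo_pipe[1]);
    errno = saved_errno;
    return ret;
}
```

Calling read_across_signal(0) should give -1 with errno set to EINTR, while read_across_signal(SA_RESTART) should give 1, because the call is re-run after the handler has written its byte. In both cases the handler runs only after the kernel has left the read.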
