Message-ID: <CALCETrWH7HbY2gS6O_cYKfp9QqqWBWVcHb++GaP3uUiSO9oo6g@mail.gmail.com>
Date: Fri, 16 Mar 2018 00:46:55 +0000
From: Andy Lutomirski <luto@...nel.org>
To: Tycho Andersen <tycho@...ho.ws>
Cc: Andy Lutomirski <luto@...nel.org>,
"Serge E. Hallyn" <serge@...lyn.com>,
Christian Brauner <christian.brauner@...onical.com>,
LKML <linux-kernel@...r.kernel.org>,
Linux Containers <containers@...ts.linux-foundation.org>,
Kees Cook <keescook@...omium.org>,
Oleg Nesterov <oleg@...hat.com>,
"Eric W . Biederman" <ebiederm@...ssion.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Tyler Hicks <tyhicks@...onical.com>,
Akihiro Suda <suda.akihiro@....ntt.co.jp>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Subject: Re: [RFC 0/3] seccomp trap to userspace
On Thu, Mar 15, 2018 at 5:35 PM, Tycho Andersen <tycho@...ho.ws> wrote:
> Hi Andy,
>
> On Thu, Mar 15, 2018 at 05:11:32PM +0000, Andy Lutomirski wrote:
>> On Thu, Mar 15, 2018 at 5:05 PM, Serge E. Hallyn <serge@...lyn.com> wrote:
>> > Hm, synchronously - that brings to mind a thought... I should re-look at
>> > Tycho's patches first, but, if I'm in a container, start some syscall that
>> > gets trapped to userspace, then I hit ctrl-c. I'd like to be able to have
>> > the handler be interrupted and have it return -EINTR. Is that going to
>> > be possible with the synchronous approach?
>>
>> I think so, but it should be possible with the classic async approach
>> too. The main issue is the difference between a classic filter like
>> this (pseudocode):
>>
>> if (nr == SYS_mount) return TRAP_TO_USERSPACE;
>>
>> and the eBPF variant:
>>
>> if (nr == SYS_mount) trap_to_userspace();
>
> Sargun started a private design discussion thread that I don't think
> you were on, but Alexei said something to the effect of "eBPF programs
> will never wait on userspace", so I'm not sure we can do something
> like this in an eBPF program. I'm cc-ing him here again to confirm,
> but I doubt things have changed.
>
>> I admit that it's still not 100% clear to me that the latter is
>> genuinely more useful than the former.
>>
>> The case where I think the synchronous function call is a huge win is this one:
>>
>> if (nr == SYS_mount) {
>> log("Someone called mount with args %lx\n", ...);
>> return RET_KILL;
>> }
>>
>> The idea being that the log message wouldn't show up in the kernel log
>> -- it would get sent to the listener socket belonging to whoever
>> created the filter, and that process could then go and log it
>> properly. This would work perfectly in containers and in totally
>> unprivileged applications like Chromium.
>
> The current implementation can't do exactly this, but you could do:
>
> if (nr == SYS_mount) {
> log(...);
> kill(pid, SIGKILL);
> }
>
> from the handler instead.
>
> I guess Serge is asking a slightly different question: what if the
> task gets e.g. SIGINT from the user doing a ^C or SIGALRM or
> something, we should probably send the handler some sort of message or
> interrupt to let it know that the syscall was cancelled. Right now the
> current set doesn't behave that way, and the handler will just
> continue on its merry way and get an EINVAL when it tries to respond
> with the cancelled cookie.
Hmm, I think we have to be very careful to avoid nasty races. I think
the correct approach is to notice the signal and send a message to the
listener that a signal is pending but to take no additional action.
If the handler ends up completing the syscall with a successful
return, we don't want to replace it with -EINTR. IOW the code looks
kind of like:

  send_to_listener("hey I got a syscall");
  wait_ret = wait_interruptible for the listener to reply;
  if (wait_ret == -EINTR) {
    send_to_listener("hey there's a signal");
    wait_ret = wait_killable for the listener to reply to the original request;
  }

  if (wait_ret == -EINTR) {
    /* hmm, this next line might not actually be necessary, but it's
       harmless and possibly useful */
    send_to_listener("hey we're going away");
    /* and stop waiting */
  }

  ... actually handle the result.

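
To make the ordering concrete, here is a rough userspace model of that flow in C. It is only a sketch: `struct listener`, the `MSG_*` names, and `trap_to_listener()` are made up for illustration and are not part of any patch in this series, and the two waits are replaced by scripted return values so that the control flow can be exercised without a kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical message types; the names are illustrative only. */
enum msg { MSG_SYSCALL, MSG_SIGNAL_PENDING, MSG_GOING_AWAY };

/*
 * Hypothetical stand-in for the listener channel. It records what was
 * sent and replays scripted outcomes for the two waits.
 */
struct listener {
    enum msg sent[4];
    size_t nsent;
    int interruptible_ret;  /* first wait: 0 or -EINTR */
    int killable_ret;       /* second wait: 0 or -EINTR */
    long reply;             /* return value chosen by the handler */
};

void send_to_listener(struct listener *l, enum msg m)
{
    assert(l->nsent < sizeof(l->sent) / sizeof(l->sent[0]));
    l->sent[l->nsent++] = m;
}

/*
 * Returns 0 and stores the handler's reply in *ret if the handler
 * answered, or -EINTR if the task was killed while waiting.
 */
int trap_to_listener(struct listener *l, long *ret)
{
    int wait_ret;

    send_to_listener(l, MSG_SYSCALL);
    wait_ret = l->interruptible_ret;    /* models the interruptible wait */

    if (wait_ret == -EINTR) {
        /* Tell the handler a signal is pending, but keep waiting. */
        send_to_listener(l, MSG_SIGNAL_PENDING);
        wait_ret = l->killable_ret;     /* models the killable wait */
    }

    if (wait_ret == -EINTR) {
        /* Fatal signal: notify the handler and stop waiting. */
        send_to_listener(l, MSG_GOING_AWAY);
        return -EINTR;
    }

    /* The handler replied; its result is used as-is, never -EINTR. */
    *ret = l->reply;
    return 0;
}
```

The case that matters is the middle one: if the first wait is interrupted but the handler still replies, the caller sees the handler's value rather than -EINTR.
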
--Andy