lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 6 May 2021 16:59:05 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Rodrigo Campos <rodrigo@...volk.io>
Cc:     Sargun Dhillon <sargun@...gun.me>,
        Andy Lutomirski <luto@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Mauricio Vásquez Bernal <mauricio@...volk.io>,
        Tycho Andersen <tycho@...ho.pizza>,
        Giuseppe Scrivano <gscrivan@...hat.com>,
        Christian Brauner <christian.brauner@...ntu.com>
Subject: Re: [PATCH v2 2/5] seccomp: Add wait_killable semantic to seccomp
 user notifier

On Thu, May 6, 2021 at 12:35 PM Rodrigo Campos <rodrigo@...volk.io> wrote:

> We want to be woken-up in seccomp_do_user_notification() so we know we
> received a signal and notify to the supervisor about it. IMHO, we want
> an additional "if" here[2], like (haven't checked the retun codes of
> functions) "if err != 0"  and do a "wake_up_poll(&match->wqh, EPOLLIN
> | EPOLLRDNORM);" or POLLPRI or something, and then "goto wait". IOW,
> we only return when the supervisor tells us to, if a signal arrives we
> just let the supervisor know and continue waiting for a response. The
> proper response might be "I wrote partial data, return only 3 bytes",
> etc. This way we can properly emulate it.
>
> What is not clear to me, and I'd like others to comment is:
>
> 1. How should this new "the supervisor should handle signals" mode be enabled?
>    a. If we use an extra ioctl, as Andy suggested, I think we have a
> race too: from the moment the seccomp syscall is issued until the new
> ioctl is called there is a race. If the container does a syscall in
> that window, it _can_ incorrectly return EINTR as it does now. This
> race can be very small and the ioctl can make _all the next syscalls_
> wait in this new mode, so maybe the race is so small that we don't
> care in practice.

If the ioctl is sticky, it can presumably avoid the race entirely by
having whoever calls seccomp() immediately do a dummy syscall or
otherwise wait for the notifier to confirm that it has done the ioctl.
Admittedly, this may be annoying.

>    b. But if we want to really fully solve the issue, the other option
> that comes to mind is adding a new flag for the seccomp syscall, to
> signal that the supervisor will handle signals and should be
> forwarded. This way, if the container does a new syscall it will be
> put to wait in this new mode (we have the information that the new
> mode should be used already). Something like
> SECCOMP_FILTER_FLAG_NEW_LISTENER_SOMETHING (today we have
> SECCOMP_FILTER_FLAG_NEW_LISTENER). Ideally shorter :D

I like that better.

>
> 2. What should we notify to the supervisor? Only that a signal arrived
> or also which signal it was? If we do the EPOLLPRI thingy, as Andy
> mentioned in a previous thread, IIUC we will only notify that _a_
> signal arrived, but not _which_. To notify which signal arrived might
> be complex, though, (not sure, I haven't explored how that should be).

Are there any examples in the kernel of syscalls that care *which*
signal came in?  As far as I know, there are really only three states
that matter: no signal is pending, a kill is pending, or a regular
signal is pending.  (The latter means that there's a signal that is
unmasked, etc and will should therefore interrupt a syscall if the
syscall can be interrupted.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ