Message-ID: <CAG48ez2TcWb6SQ86XRJDdN-Ab_gO9-sXgpFnJODMXH60mCkBJQ@mail.gmail.com>
Date: Mon, 26 Oct 2020 10:32:03 +0100
From: Jann Horn <jannh@...gle.com>
To: "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
Cc: Tycho Andersen <tycho@...ho.pizza>,
Sargun Dhillon <sargun@...gun.me>,
Kees Cook <keescook@...omium.org>,
Christian Brauner <christian@...uner.io>,
linux-man <linux-man@...r.kernel.org>,
lkml <linux-kernel@...r.kernel.org>,
Aleksa Sarai <cyphar@...har.com>,
Alexei Starovoitov <ast@...nel.org>,
Will Drewry <wad@...omium.org>, bpf <bpf@...r.kernel.org>,
Song Liu <songliubraving@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Andy Lutomirski <luto@...capital.net>,
Linux Containers <containers@...ts.linux-foundation.org>,
Giuseppe Scrivano <gscrivan@...hat.com>,
Robert Sesek <rsesek@...gle.com>
Subject: Re: For review: seccomp_user_notif(2) manual page
On Sat, Oct 24, 2020 at 2:53 PM Michael Kerrisk (man-pages)
<mtk.manpages@...il.com> wrote:
> On 10/17/20 2:25 AM, Jann Horn wrote:
> > On Fri, Oct 16, 2020 at 8:29 PM Michael Kerrisk (man-pages)
> > <mtk.manpages@...il.com> wrote:
[...]
> >> I'm not sure if I should write anything about this small UAPI
> >> breakage in BUGS, or not. Your thoughts?
> >
> > Thinking about it a bit more: Any code that relies on pause() or
> > epoll_wait() not restarting is buggy anyway, right? Because a signal
> > could also arrive directly before entering the syscall, while
> > userspace code is still executing? So one could argue that we're just
> > enlarging a preexisting race. (Unless the signal handler checks the
> > interrupted register state to figure out whether we already entered
> > syscall handling?)
>
> Yes, that all makes sense.
>
> > If userspace relies on non-restarting behavior, it should be using
> > something like epoll_pwait(). And that stuff only unblocks signals
> > after we've already passed the seccomp checks on entry.
>
> Thanks for elaborating that detail, since as soon as you talked
> about "enlarging a preexisting race" above, I immediately wondered
> about sigsuspend(), pselect(), etc.
>
> (Mind you, I still wonder about the effect on system calls that
> are normally nonrestartable because they have timeouts. My
> understanding is that the kernel doesn't restart those system
> calls because it's impossible for the kernel to restart the call
> with the right timeout value. I wonder what happens when those
> system calls are restarted in the scenario we're discussing.)
Ah, that's an interesting edge case...
> Anyway, returning to your point... So, to be clear (and to
> quickly remind myself in case I one day reread this thread),
> there is not a problem with sigsuspend(), pselect(), ppoll(),
> and epoll_pwait() since:
>
> * Before the syscall, signals are blocked in the target.
> * Inside the syscall, signals are still blocked at the time
> the check is made for seccomp filters.
> * If a seccomp user-space notification event kicks in, the target
> is put to sleep with the signals still blocked.
> * The signal will only get delivered after the supervisor either
> triggers a spoofed success/failure return in the target or the
> supervisor sends a CONTINUE response to the kernel telling it
> to execute the target's system call. Either way, there won't be
> any restarting of the target's system call (and the supervisor
> thus won't see multiple notifications).
>
> (Right?)
Yeah.
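
For illustration, the epoll_pwait() pattern summarized above could look
roughly like the sketch below. It is not taken from any real program: the
function names (setup_blocked_sigusr1, wait_for_event_or_signal), the
choice of SIGUSR1, and the return-value convention are all assumptions
made for the example. The signal stays blocked in userspace and is only
unblocked by the kernel for the duration of the wait, so there is no
window in which it can arrive "just before" the syscall.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Set by the handler; hypothetical name used only in this sketch. */
static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Install a handler for SIGUSR1 and keep the signal blocked in
 * normal userspace execution. Returns 0 on success, -1 on error. */
int setup_blocked_sigusr1(void)
{
    struct sigaction sa;
    sigset_t block;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Wait on epfd with SIGUSR1 unblocked only inside the syscall.
 * Returns 1 if the wait was interrupted by a signal (EINTR),
 * 0 on timeout or ready events, -1 on any other error. */
int wait_for_event_or_signal(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    sigset_t wait_mask;
    int n;

    /* Start from the current mask and remove SIGUSR1 from it. */
    if (sigprocmask(SIG_SETMASK, NULL, &wait_mask) == -1)
        return -1;
    sigdelset(&wait_mask, SIGUSR1);

    n = epoll_pwait(epfd, &ev, 1, timeout_ms, &wait_mask);
    if (n == -1 && errno == EINTR)
        return 1;
    return n == -1 ? -1 : 0;
}
```

With this shape, a signal sent while the thread is still in userspace
merely stays pending until epoll_pwait() installs the temporary mask,
which (per the discussion above) happens after the seccomp checks on
syscall entry.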
[...]
> > So we should probably document the restarting behavior as something
> > the supervisor has to deal with in the manpage; but for the
> > "non-restarting syscalls can restart from the target's perspective"
> > aspect, it might be enough to document this as quirky behavior that
> > can't actually break correct code? (Or not document it at all. Dunno.)
>
> So, I've added the following to the page:
>
> Interaction with SA_RESTART signal handlers
> Consider the following scenario:
>
> · The target process has used sigaction(2) to install a signal
> handler with the SA_RESTART flag.
>
> · The target has made a system call that triggered a seccomp user-
> space notification and the target is currently blocked until the
> supervisor sends a notification response.
>
> · A signal is delivered to the target and the signal handler is
> executed.
>
> · When (if) the supervisor attempts to send a notification
> response, the SECCOMP_IOCTL_NOTIF_SEND ioctl(2) operation will
> fail with the ENOENT error.
>
> In this scenario, the kernel will restart the target's system
> call. Consequently, the supervisor will receive another user-
> space notification. Thus, depending on how many times the blocked
> system call is interrupted by a signal handler, the supervisor may
> receive multiple notifications for the same system call in the
> target.
>
> One oddity is that system call restarting as described in this
> scenario will occur even for the blocking system calls listed in
> signal(7) that would never normally be restarted by the SA_RESTART
> flag.
>
> Does that seem okay?
Sounds good to me.
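
As a rough illustration of what that means for a supervisor, here is a
minimal sketch of handling the ENOENT case from the proposed text. The
helper names (classify_send, send_response) and the enum are hypothetical
and only exist for this example; the only kernel interfaces used are
ioctl(2) with SECCOMP_IOCTL_NOTIF_SEND and struct seccomp_notif_resp from
<linux/seccomp.h>. The assumption is that on ENOENT the supervisor simply
drops the stale response and goes back to receiving notifications, where
the restarted system call will show up again.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/seccomp.h>
#include <sys/ioctl.h>

/* Hypothetical outcome codes for this sketch. */
enum send_outcome {
    SEND_OK,                 /* response accepted by the kernel */
    SEND_TARGET_INTERRUPTED, /* ENOENT: notification is no longer valid */
    SEND_FAILED              /* any other error */
};

/* Map an ioctl() return value and errno to an outcome. Kept separate
 * from the ioctl() call so the policy is easy to read and check. */
enum send_outcome classify_send(int ret, int err)
{
    if (ret == 0)
        return SEND_OK;
    if (err == ENOENT)
        return SEND_TARGET_INTERRUPTED;
    return SEND_FAILED;
}

/* Try to send a notification response on the listening fd. On
 * SEND_TARGET_INTERRUPTED the caller should discard resp and return
 * to its receive loop instead of treating this as a fatal error. */
enum send_outcome send_response(int notify_fd,
                                struct seccomp_notif_resp *resp)
{
    int ret = ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, resp);

    return classify_send(ret, ret == -1 ? errno : 0);
}
```

A supervisor written this way has to tolerate seeing the same system
call several times, so any side effects it performs on behalf of the
target should be safe to repeat or be undone when the send fails.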
> In addition, I've queued a cross-reference in signal(7):
>
> In certain circumstances, the seccomp(2) user-space notifi‐
> cation feature can lead to restarting of system calls that
> would otherwise never be restarted by SA_RESTART; for
> details, see seccomp_user_notif(2).