[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrWH0vLZwkOV6JCnC-pLaHe0dboXs5-ETkXFCR9X-aq7qw@mail.gmail.com>
Date: Mon, 14 Sep 2015 10:52:46 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Tycho Andersen <tycho.andersen@...onical.com>
Cc: Kees Cook <keescook@...omium.org>,
Pavel Emelyanov <xemul@...allels.com>,
Network Development <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Oleg Nesterov <oleg@...hat.com>,
"Serge E. Hallyn" <serge.hallyn@...ntu.com>,
Linux API <linux-api@...r.kernel.org>,
Will Drewry <wad@...omium.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: v2 of seccomp filter c/r patches
On Sep 11, 2015 10:28 AM, "Tycho Andersen" <tycho.andersen@...onical.com> wrote:
>
> On Fri, Sep 11, 2015 at 10:00:22AM -0700, Andy Lutomirski wrote:
> > On Fri, Sep 11, 2015 at 9:30 AM, Andy Lutomirski <luto@...capital.net> wrote:
> > > On Sep 10, 2015 5:22 PM, "Tycho Andersen" <tycho.andersen@...onical.com> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> Here is v2 of the seccomp filter c/r set. The patch notes have individual
> > >> changes from the last series, but there are two points not noted:
> > >>
> > >> * The series still does not allow us to correctly restore state for programs
> > >> that will use SECCOMP_FILTER_FLAG_TSYNC in the future. Given that we want to
> > >> keep seccomp_filter's identity, I think something along the lines of another
> > >> seccomp command like SECCOMP_INHERIT_PARENT is needed (although I'm not sure
> > >> if this can even be done yet). In addition, we'll need a kcmp command for
> > >> figuring out if filters are the same, although this too needs to compare
> > >> seccomp_filter objects, so it's a little screwy. Any thoughts on how to do
> > >> this nicely are welcome.
> > >
> > > Let's add a concept of a seccompfd.
> > >
> > > For background of what I want to add: I want to be able to create a
> > > seccomp monitor. A seccomp monitor will be, logically, a pair of a
> > > struct file that represents the monitor and a seccomp_filter that is
> > > controlled by the monitor. Depending on flags, whoever holds the
> > > monitor fd could change the active filter, intercept syscalls, and
> > > issue syscalls on behalf of a process that is trapped in an
> > > intercepted syscall.
> > >
> > > Seccomp filters would nest properly.
> > >
> > > The interface would probably be (extremely pseudocoded):
> > >
> > > monitor_fd, filter_fd = seccomp(CREATE_MONITOR, flags, ...);
> > >
> > > Then, later:
> > >
> > > seccomp(ATTACH_TO_FILTER, filter_fd); /* now filtered */
> > >
> > > read(monitor_fd, buf, size); /* returns an intercepted syscall */
> > > write(monitor_fd, buf, size); /* issues a syscall or releases the
> > > trapped task */
> > >
> > > This can't be implemented on x86 without either going insane or
> > > finishing the massive set of pending cleanups to the x86 entry code.
> > > I favor the latter.
> > >
> > > We could, however, add part of it right now: we could have a way to
> > > create a filterfd, we could add kcmp support for it, and we could add
> > > the ATTACH_TO_FILTER thing. I think that would solve your problem.
> > >
> > > One major open question: does a filter_fd know what its parent is and,
> > > if so, will it just refuse to attach if the caller's parent is wrong?
> > > Or will a filter_fd attach anywhere.
> > >
> >
> > Let me add one more thought:
> >
> > Currently, struct seccomp_filter encodes a strict tree hierarchy: it
> > knows what its parent is. This only matters as an implementation
> > detail and because TSYNC checks for seccomp_filter equality.
> >
> > We could change this without user-visible effects. We could say that,
> > for TSYNC purposes, two filter states match if they contain exactly
> > the same layers in the same order where a layer does *not* encode a
> > concept of parent. We could then say that attaching a classic bpf
> > filter creates a branch new layer that is not equal to any other layer
> > that's been created.
> >
> > This has no effect whatsoever. The difference would be that we could
> > declare that attaching the same ebpf program twice creates the *same*
> > layer so that, if you fork and both children attach the same ebpf
> > program, then they match for TSYNC purposes.
>
> Would you keep struct seccomp_filter identity here (meaning that you'd
> reach over and grab the seccomp_filter from a sibling thread if it
> existed)? Would it only work for the last filter attached to siblings,
> or for all the filters? This does make my life easier, but I like the
> idea of just using seccompfd directly below as it seems somewhat
> easier (for me at least) to understand,
>
If we did that, it would just be an internal optimization.
> > Similarly, attaching the
> > same hypothetical filterfd would create the same layer.
>
> If we change the api of my current set to have the ptrace commands
> iterate over seccomp fds, it looks something like:
>
> seccompfd = ptrace(GET_FILTER_FD, pid);
> while (ptrace(NEXT_FD, pid, seccompfd) == 0) {
> if (seccomp(CHECK_INHERITED, seccompfd))
> break;
>
> bpffd = seccomp(GET_BPF_FD, seccompfd);
> err = buf(BPF_PROG_DUMP, bpffd, &attr);
> /* save the bpf prog */
> }
>
> then restore can look like:
>
> while (have_noninherited_filters()) {
> filter = load_filter();
> bpffd = bpf(BPF_PROG_LOAD, filter);
> seccompfd = seccomp(SECCOMP_FD_CREATE, bpffd);
>
> filters[n_filters++] = seccompfd;
> }
>
> /* fork any children as necessary and do the rest of the restore */
>
> for (i = 0; i < n_filters; i++) {
> seccomp(SECCOMP_FD_INSTALL, filters[i]);
> }
>
> then the only question is how to implement the CHECK_INHERITED command
> on dump.
I don't think it would be a well defined operation. I think you'd
have to ask "for this pid, give me the nth thing in the stack", since
an fd identifying a layer without reference to its parent would no
longer even be guaranteed to be unique in the filter stack for a given
task.
I'm not sure I entirely like this solution...
>
> If we support the above API, we don't need to think about the concept
> of layers at all, or do any extra work on filter install to preserve
> struct seccomp_filter identity, it just comes naturally.
>
> Tycho
>
> > Thoughts?
> >
> > --Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists