Message-ID: <43039bb6-9d9f-b347-fa92-ea34ccc21d3d@rasmusvillemoes.dk>
Date: Thu, 24 Sep 2020 15:40:54 +0200
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Kees Cook <keescook@...omium.org>,
YiFei Zhu <yifeifz2@...inois.edu>
Cc: Jann Horn <jannh@...gle.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Tycho Andersen <tycho@...ho.pizza>,
Andy Lutomirski <luto@...capital.net>,
Will Drewry <wad@...omium.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Giuseppe Scrivano <gscrivan@...hat.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Dimitrios Skarlatos <dskarlat@...cmu.edu>,
Valentin Rothberg <vrothber@...hat.com>,
Hubertus Franke <frankeh@...ibm.com>,
Jack Chen <jianyan2@...inois.edu>,
Josep Torrellas <torrella@...inois.edu>,
Tianyin Xu <tyxu@...inois.edu>, bpf@...r.kernel.org,
containers@...ts.linux-foundation.org, linux-api@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 0/6] seccomp: Implement constant action bitmaps

On 24/09/2020 01.29, Kees Cook wrote:
> rfc: https://lore.kernel.org/lkml/20200616074934.1600036-1-keescook@chromium.org/
> alternative: https://lore.kernel.org/containers/cover.1600661418.git.yifeifz2@illinois.edu/
> v1:
> - rebase to for-next/seccomp
> - finish X86_X32 support for both pinning and bitmaps
> - replace TLB magic with Jann's emulator
> - add JSET insn
>
> TODO:
> - add ALU|AND insn
> - significantly more testing
>
> Hi,
>
> This is a refresh of my earlier constant action bitmap series. It looks
> like the RFC was missed on the container list, so I've CCed it now. :)
> I'd like to work from this series, as it handles the multi-architecture
> stuff.

So, I agree with Jann's point that the only thing that matters is that
always-allowed syscalls are indeed allowed fast.

But there's one thing I'm wondering about that I haven't seen addressed
anywhere: Why build the bitmap on the kernel side (with all the
complexity of having to emulate the filter for all syscalls)? Why can't
userspace just hand the kernel "here's a new filter: the syscalls in this
bitmap are always allowed, no questions asked; for the rest, run this
bpf"? Sure, that might require a new syscall or extending seccomp(2)
somewhat, but isn't that a _lot_ simpler? It would probably also mean
that the bpf we do get handed is a lot smaller. Userspace might need to
pass a couple of bitmaps, one for each relevant arch, but you get the
overall idea.
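
To make the userspace-built bitmap idea concrete, here is a minimal
sketch of what the userspace side might look like. Everything in it is
hypothetical: the sketch_* names, the struct layout and the
SKETCH_NR_SYSCALLS bound are assumptions for illustration and are not
part of seccomp(2) or of the posted series.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-arch syscall count; not a uapi constant. */
#define SKETCH_NR_SYSCALLS	512
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITMAP_LONGS \
	((SKETCH_NR_SYSCALLS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical: one "always allowed" bitmap; userspace would pass one per arch. */
struct sketch_allow_bitmap {
	unsigned long bits[SKETCH_BITMAP_LONGS];
};

/* Mark syscall number nr as always allowed; out-of-range numbers are ignored. */
static void sketch_allow(struct sketch_allow_bitmap *bm, unsigned int nr)
{
	if (nr < SKETCH_NR_SYSCALLS)
		bm->bits[nr / SKETCH_BITS_PER_LONG] |=
			1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Query the bitmap; anything not set here would fall through to the bpf. */
static bool sketch_is_allowed(const struct sketch_allow_bitmap *bm,
			      unsigned int nr)
{
	if (nr >= SKETCH_NR_SYSCALLS)
		return false;
	return (bm->bits[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

A container runtime would zero one such bitmap per arch, call
sketch_allow() for each unconditionally allowed syscall, and hand the
bitmaps to the kernel alongside a (smaller) bpf program covering only
the remaining syscalls. How that handoff would look is left open here.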

I'm also a bit worried about the performance of doing that emulation;
that's a constant extra overhead for, say, launching a Docker container.

Regardless of how the kernel's bitmap gets created, something like

+	if (nr < NR_syscalls) {
+		if (test_bit(nr, bitmaps->allow)) {
+			*filter_ret = SECCOMP_RET_ALLOW;
+			return true;
+		}

probably wants some nospec protection somewhere to avoid the irony of
seccomp() being used actively by bad guys.
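
For illustration, the kind of clamping meant here can be sketched in
plain C. The kernel has array_index_nospec() in <linux/nospec.h> for
this purpose; the code below only mirrors the arithmetic of its generic
fallback in userspace, using hypothetical sketch_* names. It assumes
that right-shifting a negative long is an arithmetic shift (true for
gcc and clang), and it is not itself a Spectre mitigation, since a
userspace compiler is free to optimize the mask away.

```c
#include <limits.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Branch-free mask: all ones if index < size, zero otherwise.
 * Assumes an arithmetic right shift of negative values and
 * size <= LONG_MAX.
 */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (SKETCH_BITS_PER_LONG - 1));
}

/* Clamp index to 0 when it is out of bounds, without a conditional branch. */
static unsigned long sketch_index_nospec(unsigned long index,
					 unsigned long size)
{
	return index & sketch_index_mask(index, size);
}

/*
 * Hypothetical lookup: bounds check first, then clamp the index before it
 * is used to address the bitmap, so a mispredicted bounds check cannot
 * steer the load to an out-of-range word.
 */
static bool sketch_test_allow(const unsigned long *allow, unsigned long nr,
			      unsigned long nr_syscalls)
{
	if (nr >= nr_syscalls)
		return false;
	nr = sketch_index_nospec(nr, nr_syscalls);
	return (allow[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

In the quoted hunk, the equivalent would be clamping nr after the
nr < NR_syscalls check and before test_bit(); where exactly that belongs
in the series is a judgment call for the patch author.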

Rasmus