linux-kernel - Re: [PATCH v1 0/6] seccomp: Implement constant action bitmaps

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CABqSeAQKksqM1SdsQMoR52AJ5CY0VE2tk8-TJaMuOrkCprQ0MQ@mail.gmail.com>
Date:   Thu, 24 Sep 2020 08:58:53 -0500
From:   YiFei Zhu <zhuyifei1999@...il.com>
To:     Rasmus Villemoes <linux@...musvillemoes.dk>
Cc:     Kees Cook <keescook@...omium.org>,
        YiFei Zhu <yifeifz2@...inois.edu>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Giuseppe Scrivano <gscrivan@...hat.com>,
        Will Drewry <wad@...omium.org>, bpf <bpf@...r.kernel.org>,
        Jann Horn <jannh@...gle.com>,
        Linux API <linux-api@...r.kernel.org>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Hubertus Franke <frankeh@...ibm.com>,
        Andy Lutomirski <luto@...capital.net>,
        Valentin Rothberg <vrothber@...hat.com>,
        Dimitrios Skarlatos <dskarlat@...cmu.edu>,
        Jack Chen <jianyan2@...inois.edu>,
        Josep Torrellas <torrella@...inois.edu>,
        Tianyin Xu <tyxu@...inois.edu>,
        kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 0/6] seccomp: Implement constant action bitmaps

On Thu, Sep 24, 2020 at 8:46 AM Rasmus Villemoes
<linux@...musvillemoes.dk> wrote:
> But one thing I'm wondering about and I haven't seen addressed anywhere:
> Why build the bitmap on the kernel side (with all the complexity of
> having to emulate the filter for all syscalls)? Why can't userspace just
> hand the kernel "here's a new filter: the syscalls in this bitmap are
> always allowed noquestionsasked, for the rest, run this bpf". Sure, that
> might require a new syscall or extending seccomp(2) somewhat, but isn't
> that a _lot_ simpler? It would probably also mean that the bpf we do get
> handed is a lot smaller. Userspace might need to pass a couple of
> bitmaps, one for each relevant arch, but you get the overall idea.

Perhaps. The thing is, the current API expects any filter attaches to
be "additive". If a new filter gets attached that says "disallow read"
then no matter whatever has been attached already, "read" shall not be
allowed at the next syscall, bypassing all previous allowlist bitmaps
(so you need to emulate the bpf anyways here?). We should also not
have a API that could let anyone escape the secomp jail. Say "prctl"
is permitted but "read" is not permitted, one must not be allowed to
attach a bitmap so that "read" now appears in the allowlist. The only
way this could potentially work is to attach a BPF filter and a bitmap
at the same time in the same syscall, which might mean API redesign?

> I'm also a bit worried about the performance of doing that emulation;
> that's constant extra overhead for, say, launching a docker container.

IMO, launching a docker container is so expensive this should be negligible.

YiFei Zhu