[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez0gBRvTEXX_L3881jQM8Aw6SURbMPafW18GihWe4ZmtmQ@mail.gmail.com>
Date: Mon, 21 Sep 2020 20:38:11 +0200
From: Jann Horn <jannh@...gle.com>
To: YiFei Zhu <zhuyifei1999@...il.com>
Cc: Linux Containers <containers@...ts.linux-foundation.org>,
YiFei Zhu <yifeifz2@...inois.edu>, bpf <bpf@...r.kernel.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Dimitrios Skarlatos <dskarlat@...cmu.edu>,
Giuseppe Scrivano <gscrivan@...hat.com>,
Hubertus Franke <frankeh@...ibm.com>,
Jack Chen <jianyan2@...inois.edu>,
Josep Torrellas <torrella@...inois.edu>,
Kees Cook <keescook@...omium.org>,
Tianyin Xu <tyxu@...inois.edu>,
Tobin Feldman-Fitzthum <tobin@....com>,
Valentin Rothberg <vrothber@...hat.com>,
Andy Lutomirski <luto@...capital.net>,
Will Drewry <wad@...omium.org>,
Aleksa Sarai <cyphar@...har.com>,
kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH seccomp 1/2] seccomp/cache: Add "emulator" to check if
filter is arg-dependent
On Mon, Sep 21, 2020 at 7:47 PM Jann Horn <jannh@...gle.com> wrote:
> On Mon, Sep 21, 2020 at 7:35 AM YiFei Zhu <zhuyifei1999@...il.com> wrote:
> > SECCOMP_CACHE_NR_ONLY will only operate on syscalls that do not
> > access any syscall arguments or instruction pointer. To facilitate
> > this we need a static analyser to know whether a filter will
> > access. This is implemented here with a pseudo-emulator, and
> > stored in a per-filter bitmap. Each seccomp cBPF instruction,
> > aside from ALU (which should rarely be used in seccomp), gets a
> > naive best-effort emulation for each syscall number.
> >
> > The emulator works by following all possible (without SAT solving)
> > paths the filter can take. Every cBPF register / memory position
> > records whether that is a constant, and of so, the value of the
> > constant. Loading from struct seccomp_data is considered constant
> > if it is a syscall number, else it is an unknown. For each
> > conditional jump, if the both arguments can be resolved to a
> > constant, the jump is followed after computing the result of the
> > condition; else both directions are followed, by pushing one of
> > the next states to a linked list of next states to process. We
> > keep a finite number of pending states to process.
>
> Is this actually necessary, or can we just bail out on any branch that
> we can't statically resolve?
Aaaah, now I get what's going on. You statically compute a bitmask
that says whether a given syscall number always has a fixed result
*per architecture number*, and then use that later to decide whether
results can be cached for the combination of a specific seccomp filter
and a specific architecture number. Which mostly works, except that it
means you end up with weird per-thread caches and you get interference
between ABIs (so if a process e.g. filters the argument numbers for
syscall 123 in ABI 1, the results for syscall 123 in ABI 2 also can't
be cached).
Anyway, even though this works, I think it's the wrong way to go about it.
Powered by blists - more mailing lists