linux-kernel - Re: [RFC PATCH seccomp 1/2] seccomp/cache: Add "emulator" to check if filter is arg-dependent

Message-ID: <CAG48ez0gBRvTEXX_L3881jQM8Aw6SURbMPafW18GihWe4ZmtmQ@mail.gmail.com>
Date:   Mon, 21 Sep 2020 20:38:11 +0200
From:   Jann Horn <jannh@...gle.com>
To:     YiFei Zhu <zhuyifei1999@...il.com>
Cc:     Linux Containers <containers@...ts.linux-foundation.org>,
        YiFei Zhu <yifeifz2@...inois.edu>, bpf <bpf@...r.kernel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Dimitrios Skarlatos <dskarlat@...cmu.edu>,
        Giuseppe Scrivano <gscrivan@...hat.com>,
        Hubertus Franke <frankeh@...ibm.com>,
        Jack Chen <jianyan2@...inois.edu>,
        Josep Torrellas <torrella@...inois.edu>,
        Kees Cook <keescook@...omium.org>,
        Tianyin Xu <tyxu@...inois.edu>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Valentin Rothberg <vrothber@...hat.com>,
        Andy Lutomirski <luto@...capital.net>,
        Will Drewry <wad@...omium.org>,
        Aleksa Sarai <cyphar@...har.com>,
        kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH seccomp 1/2] seccomp/cache: Add "emulator" to check if
 filter is arg-dependent

On Mon, Sep 21, 2020 at 7:47 PM Jann Horn <jannh@...gle.com> wrote:
> On Mon, Sep 21, 2020 at 7:35 AM YiFei Zhu <zhuyifei1999@...il.com> wrote:
> > SECCOMP_CACHE_NR_ONLY will only operate on syscalls that do not
> > access any syscall arguments or instruction pointer. To facilitate
> > this we need a static analyser to know whether a filter will
> > access. This is implemented here with a pseudo-emulator, and
> > stored in a per-filter bitmap. Each seccomp cBPF instruction,
> > aside from ALU (which should rarely be used in seccomp), gets a
> > naive best-effort emulation for each syscall number.
> >
> > The emulator works by following all possible (without SAT solving)
> > paths the filter can take. Every cBPF register / memory position
> > records whether that is a constant, and if so, the value of the
> > constant. Loading from struct seccomp_data is considered constant
> > if it is a syscall number, else it is an unknown. For each
> > conditional jump, if both arguments can be resolved to a
> > constant, the jump is followed after computing the result of the
> > condition; else both directions are followed, by pushing one of
> > the next states to a linked list of next states to process. We
> > keep a finite number of pending states to process.
>
> Is this actually necessary, or can we just bail out on any branch that
> we can't statically resolve?

Aaaah, now I get what's going on. You statically compute a bitmask
that says whether a given syscall number always has a fixed result
*per architecture number*, and then use that later to decide whether
results can be cached for the combination of a specific seccomp filter
and a specific architecture number. Which mostly works, except that it
means you end up with weird per-thread caches and you get interference
between ABIs (so if a process e.g. filters the argument numbers for
syscall 123 in ABI 1, the results for syscall 123 in ABI 2 also can't
be cached).

Anyway, even though this works, I think it's the wrong way to go about it.
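
For illustration only, here is a rough sketch of the simpler "bail out on
any branch we can't statically resolve" variant asked about above, applied
to one fixed (arch, nr) pair. It is not the code from the patch. The
instruction set is a made-up toy subset (toy_op, toy_insn,
toy_const_action and toy_example are all hypothetical names), not the real
cBPF encoding from <linux/filter.h>. The only facts taken from the real
ABI are that nr sits at offset 0 and arch at offset 4 of struct
seccomp_data, and the numeric values of SECCOMP_RET_ALLOW and
SECCOMP_RET_KILL_THREAD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy subset of cBPF; NOT the real encoding from <linux/filter.h>. */
enum toy_op { TOY_LD_ABS, TOY_JEQ_K, TOY_RET_K };

struct toy_insn {
	enum toy_op op;
	uint32_t k;     /* load offset, compare constant, or return value */
	uint8_t jt, jf; /* relative jump offsets, used by TOY_JEQ_K only */
};

/* Offsets inside struct seccomp_data: nr at 0, arch at 4. */
#define OFF_NR   0u
#define OFF_ARCH 4u

/*
 * Walk the filter for one fixed (arch, nr). Returns 1 and stores the
 * action in *action if every branch on the path could be resolved from
 * arch and nr alone. Returns 0 as soon as a branch depends on anything
 * else (arguments, instruction pointer), i.e. the result is not cacheable.
 */
static int toy_const_action(const struct toy_insn *prog, size_t len,
			    uint32_t arch, uint32_t nr, uint32_t *action)
{
	uint32_t acc = 0;
	int acc_known = 0; /* conservatively treat the initial A as unknown */
	size_t pc = 0;

	assert(prog != NULL && action != NULL);

	/* Jumps are forward-only, so pc strictly increases and this ends. */
	while (pc < len) {
		const struct toy_insn *insn = &prog[pc];

		switch (insn->op) {
		case TOY_LD_ABS:
			if (insn->k == OFF_NR) {
				acc = nr;
				acc_known = 1;
			} else if (insn->k == OFF_ARCH) {
				acc = arch;
				acc_known = 1;
			} else {
				acc_known = 0; /* args or IP: not constant */
			}
			pc++;
			break;
		case TOY_JEQ_K:
			if (!acc_known)
				return 0; /* can't resolve: bail out */
			pc += 1 + (acc == insn->k ? insn->jt : insn->jf);
			break;
		case TOY_RET_K:
			*action = insn->k;
			return 1;
		}
	}
	return 0; /* fell off the end: treat as not cacheable */
}

/* Example: nr 1 is always allowed, nr 2 depends on args[0], rest killed. */
static const struct toy_insn toy_example[] = {
	{ TOY_LD_ABS, OFF_NR,      0, 0 }, /* 0: A = nr */
	{ TOY_JEQ_K,  1,           0, 1 }, /* 1: nr == 1 ? goto 2 : goto 3 */
	{ TOY_RET_K,  0x7fff0000u, 0, 0 }, /* 2: SECCOMP_RET_ALLOW */
	{ TOY_JEQ_K,  2,           0, 2 }, /* 3: nr == 2 ? goto 4 : goto 6 */
	{ TOY_LD_ABS, 16,          0, 0 }, /* 4: A = low word of args[0] */
	{ TOY_JEQ_K,  0,           0, 0 }, /* 5: branch on an unknown value */
	{ TOY_RET_K,  0,           0, 0 }, /* 6: SECCOMP_RET_KILL_THREAD */
};
```

Running toy_const_action() once per syscall number for a given arch would
yield one bit per number, which is roughly the per-architecture bitmask
described above. This sketch only bails when a branch depends on an
unknown value. A stricter reading of "do not access any syscall arguments"
would bail on the load itself.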