Date:   Wed, 23 Sep 2020 16:29:17 -0700
From:   Kees Cook <keescook@...omium.org>
To:     YiFei Zhu <yifeifz2@...inois.edu>
Cc:     Kees Cook <keescook@...omium.org>, Jann Horn <jannh@...gle.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Tycho Andersen <tycho@...ho.pizza>,
        Andy Lutomirski <luto@...capital.net>,
        Will Drewry <wad@...omium.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Giuseppe Scrivano <gscrivan@...hat.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Dimitrios Skarlatos <dskarlat@...cmu.edu>,
        Valentin Rothberg <vrothber@...hat.com>,
        Hubertus Franke <frankeh@...ibm.com>,
        Jack Chen <jianyan2@...inois.edu>,
        Josep Torrellas <torrella@...inois.edu>,
        Tianyin Xu <tyxu@...inois.edu>, bpf@...r.kernel.org,
        containers@...ts.linux-foundation.org, linux-api@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [PATCH v1 0/6] seccomp: Implement constant action bitmaps

rfc: https://lore.kernel.org/lkml/20200616074934.1600036-1-keescook@chromium.org/
alternative: https://lore.kernel.org/containers/cover.1600661418.git.yifeifz2@illinois.edu/
v1:
- rebase to for-next/seccomp
- finish X86_X32 support for both pinning and bitmaps
- replace TLB magic with Jann's emulator
- add JSET insn

TODO:
- add ALU|AND insn
- significantly more testing

Hi,

This is a refresh of my earlier constant action bitmap series. It looks
like the RFC was missed on the container list, so I've CCed it now. :)
I'd like to work from this series, as it handles the multi-architecture
stuff.

Repeating the commit log from patch 3:

    seccomp: Implement constant action bitmaps
    
    One of the most common pain points with seccomp filters has been dealing
    with the overhead of processing the filters, especially for "always allow"
    or "always reject" cases. While BPF is extremely fast[1], it will always
    have overhead associated with it. Additionally, due to seccomp's design,
    filters are layered, which means processing time goes up as the number
    of filters attached goes up.
    
    In the past, efforts have been focused on making filter execution complete
    in a shorter amount of time. For example, filters were rewritten from
    using linear if/then/else syscall search to using balanced binary trees,
    or reordered so that tests for syscalls common to the process's workload
    come first in the filter. However, there are limits to this, especially when
    some processes are dealing with tens of filters[2], or when some
    architectures have a less efficient BPF engine[3].
    
    The most common use of seccomp is constructing syscall block/allow-lists,
    where syscalls are always allowed or always rejected (without regard
    to any arguments). This use also tends to produce the most pathological
    runtime problems, in that a large number of syscall checks in the filter
    need to be performed to come to a determination.
    
    In order to optimize these cases from O(n) to O(1), seccomp can
    use bitmaps to immediately determine the desired action. A critical
    observation in the prior paragraph bears repeating: the common case for
    syscall tests does not check arguments. For any given filter, there is a
    constant mapping from the combination of architecture and syscall to the
    seccomp action result. (For kernels/architectures without CONFIG_COMPAT,
    there is a single architecture.) As such, it is possible to construct
    a mapping of arch/syscall to action, which can be updated as new filters
    are attached to a process.
    
    In order to build this mapping at filter attach time, each filter is
    executed for every syscall (under each possible architecture), and
    checked for any accesses of struct seccomp_data other than the "arch"
    and "nr" (syscall) members. If only "arch" and "nr" are examined, then
    there is a constant mapping for that syscall, and bitmaps can be updated
    accordingly. If any accesses happen outside of those struct members,
    seccomp must not bypass filter execution for that syscall, since program
    state will be used to determine filter action result. (This logic comes
    in the next patch.)
    
    [1] https://lore.kernel.org/bpf/20200531171915.wsxvdjeetmhpsdv2@ast-mbp.dhcp.thefacebook.com/
    [2] https://lore.kernel.org/bpf/20200601101137.GA121847@gardel-login/
    [3] https://lore.kernel.org/bpf/717a06e7f35740ccb4c70470ec70fb2f@huawei.com/
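
To make the idea in the commit log concrete, here is a minimal userspace
sketch of it. It is not the code from this series: the instruction set,
the names (emulate(), build_allow_bitmap(), always_allowed()), the action
values, and the 64-entry syscall table are all assumptions made up for
illustration. It runs a toy filter once per syscall number at "attach"
time, records only those syscalls whose result is "allow" without ever
reading an argument, and then answers later lookups with a single bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Everything here is hypothetical and simplified; it is not kernel code. */
#define NR_SYSCALLS  64u                       /* toy syscall table size */
#define BITMAP_WORDS ((NR_SYSCALLS + 63u) / 64u)
#define ARCH_NATIVE  1u                        /* placeholder arch token */
#define ACT_KILL     0u
#define ACT_ALLOW    1u

/* Toy instruction set, loosely modelled on classic BPF. */
enum op { LD_NR, LD_ARCH, LD_ARG, JEQ, RET };

struct insn {
	enum op op;
	uint32_t k;      /* constant for JEQ and RET */
	uint8_t jt, jf;  /* relative jump offsets for JEQ */
};

/*
 * Emulate the filter for one fixed arch/nr pair. Returns true and stores
 * the action only if the filter returned without loading anything other
 * than "nr" or "arch"; returns false if the result may depend on arguments.
 */
static bool emulate(const struct insn *prog, size_t len,
		    uint32_t arch, uint32_t nr, uint32_t *action)
{
	uint32_t acc = 0;

	for (size_t pc = 0; pc < len; pc++) {
		const struct insn *i = &prog[pc];

		switch (i->op) {
		case LD_NR:   acc = nr;   break;
		case LD_ARCH: acc = arch; break;
		case LD_ARG:  return false;  /* not a constant action */
		case JEQ:     pc += (acc == i->k) ? i->jt : i->jf; break;
		case RET:     *action = i->k; return true;
		}
	}
	return false;  /* ran off the end: treat as not constant */
}

/* Attach time: O(NR_SYSCALLS) emulations, done once per filter. */
static void build_allow_bitmap(const struct insn *prog, size_t len,
			       uint32_t arch, uint64_t *bitmap)
{
	assert(bitmap != NULL);
	for (uint32_t nr = 0; nr < NR_SYSCALLS; nr++) {
		uint32_t action;

		if (emulate(prog, len, arch, nr, &action) && action == ACT_ALLOW)
			bitmap[nr / 64] |= UINT64_C(1) << (nr % 64);
	}
}

/* Syscall time: O(1) test; a clear bit means "run the real filter". */
static bool always_allowed(const uint64_t *bitmap, uint32_t nr)
{
	return nr < NR_SYSCALLS && ((bitmap[nr / 64] >> (nr % 64)) & 1u);
}

/* Example: allow nr 1 outright, inspect an argument for nr 2, kill the rest. */
static const struct insn example_filter[] = {
	{ LD_NR,  0,         0, 0 },
	{ JEQ,    1,         3, 0 },  /* nr == 1: jump to RET ALLOW */
	{ JEQ,    2,         0, 1 },  /* nr == 2: fall into the argument check */
	{ LD_ARG, 0,         0, 0 },
	{ RET,    ACT_KILL,  0, 0 },
	{ RET,    ACT_ALLOW, 0, 0 },
};
```

With this example filter, only syscall 1 ends up in the bitmap. Syscall 2
is left out because its result depends on an argument, and everything else
is left out because this sketch tracks "always allow" only, so those
syscalls would still go through normal filter execution.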


Thanks!

-Kees


Kees Cook (6):
  seccomp: Introduce SECCOMP_PIN_ARCHITECTURE
  x86: Enable seccomp architecture tracking
  seccomp: Implement constant action bitmaps
  seccomp: Emulate basic filters for constant action results
  selftests/seccomp: Compare bitmap vs filter overhead
  [DEBUG] seccomp: Report bitmap coverage ranges

 arch/x86/include/asm/seccomp.h                |  14 +
 include/linux/seccomp.h                       |  27 +
 include/uapi/linux/seccomp.h                  |   1 +
 kernel/seccomp.c                              | 473 +++++++++++++++++-
 net/core/filter.c                             |   3 +-
 .../selftests/seccomp/seccomp_benchmark.c     | 151 +++++-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  33 ++
 tools/testing/selftests/seccomp/settings      |   2 +-
 8 files changed, 674 insertions(+), 30 deletions(-)

-- 
2.25.1
