lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202009242000.DE12689BD8@keescook>
Date:   Thu, 24 Sep 2020 20:09:07 -0700
From:   Kees Cook <keescook@...omium.org>
To:     YiFei Zhu <zhuyifei1999@...il.com>
Cc:     YiFei Zhu <yifeifz2@...inois.edu>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        bpf <bpf@...r.kernel.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Aleksa Sarai <cyphar@...har.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andy Lutomirski <luto@...capital.net>,
        Dimitrios Skarlatos <dskarlat@...cmu.edu>,
        Giuseppe Scrivano <gscrivan@...hat.com>,
        Hubertus Franke <frankeh@...ibm.com>,
        Jack Chen <jianyan2@...inois.edu>,
        Jann Horn <jannh@...gle.com>,
        Josep Torrellas <torrella@...inois.edu>,
        Tianyin Xu <tyxu@...inois.edu>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Tycho Andersen <tycho@...ho.pizza>,
        Valentin Rothberg <vrothber@...hat.com>,
        Will Drewry <wad@...omium.org>
Subject: Re: [PATCH v2 seccomp 2/6] asm/syscall.h: Add syscall_arches[] array

On Thu, Sep 24, 2020 at 08:27:40PM -0500, YiFei Zhu wrote:
> [resending this too]
> 
> On Thu, Sep 24, 2020 at 6:01 PM Kees Cook <keescook@...omium.org> wrote:
> > Disregarding the "how" of this, yeah, we'll certainly need something to
> > tell seccomp about the arrangement of syscall tables and how to find
> > them.
> >
> > However, I'd still prefer to do this on a per-arch basis, and include
> > more detail, as I've got in my v1.
> >
> > Something missing from both styles, though, is a consolidation of
> > values, where the AUDIT_ARCH* isn't reused in both the seccomp info and
> > the syscall_get_arch() return. The problems here were two-fold:
> >
> > 1) putting this in syscall.h meant you do not have full NR_syscall*
> >    visibility on some architectures (e.g. arm64 plays weird games with
> >    header include order).
> 
> I don't get this one -- I'm not playing with NR_syscall here.

Right, sorry, I may not have been clear. When building my RFC I noticed
that I couldn't use NR_syscall very "early" in the header file include
stack on arm64, which complicated things. So I guess what I mean is
something like "it's probably better to do all these seccomp-specific
macros/etc in asm/include/seccomp.h rather than in syscall.h because I
know at least one architecture that might cause trouble."

> > 2) seccomp needs to handle "multiplexed" tables like x86_x32 (distros
> >    haven't removed CONFIG_X86_X32 widely yet, so it is a reality that
> >    it must be dealt with), which means seccomp's idea of the arch
> >    "number" can't be the same as the AUDIT_ARCH.
> 
> Why so? Does anyone actually use x32 in a container? The memory cost
> and analysis cost is on everyone. The worst case scenario if we don't
> support it is that the syscall is not accelerated.

Ironicailly, that's the only place I actually know for sure where people
using x32 because it shows measurable (10%) speed-up for builders:
https://lore.kernel.org/lkml/CAOesGMgu1i3p7XMZuCEtj63T-ST_jh+BfaHy-K6LhgqNriKHAA@mail.gmail.com

So, yes, as you and Jann both point out, it wouldn't be terrible to just
ignore x32, it seems a shame to penalize it. That said, if the masking
step from my v1 is actually noticable on a native workload, then yeah,
probably x32 should be ignored. My instinct (not measured) is that it's
faster than walking a small array.[citation needed]

> > So, likely a combo of approaches is needed: an array (or more likely,
> > enum), declared in the per-arch seccomp.h file. And I don't see a way
> > to solve #1 cleanly.
> >
> > Regardless, it needs to be split per architecture so that regressions
> > can be bisected/reverted/isolated cleanly. And if we can't actually test
> > it at runtime (or find someone who can) it's not a good idea to make the
> > change. :)
> 
> You have a good point regarding tests. Don't see how it affects
> regressions though. Only one file here is ever included per-build.

It's easier to do a per-arch revert (i.e. all the -stable tree
machinery, etc) with a single SHA instead of having to write a partial
revert, etc.

-- 
Kees Cook

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ