netdev - Re: [PATCH bpf-next v2] bpf, seccomp: fix false positive preemption splat for cbpf->ebpf progs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXu5jKE3PzxNDEMagWFG+37_6FLyQgkLRShHQHB7Ufq-8egDA@mail.gmail.com>
Date:   Thu, 21 Feb 2019 11:53:06 -0800
From:   Kees Cook <keescook@...omium.org>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Jann Horn <jannh@...gle.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andy Lutomirski <luto@...capital.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2] bpf, seccomp: fix false positive preemption
 splat for cbpf->ebpf progs

On Thu, Feb 21, 2019 at 11:29 AM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Thu, Feb 21, 2019 at 01:56:53PM +0100, Jann Horn wrote:
> > On Thu, Feb 21, 2019 at 9:53 AM Daniel Borkmann <daniel@...earbox.net> wrote:
> > > On 02/21/2019 06:31 AM, Kees Cook wrote:
> > > > On Wed, Feb 20, 2019 at 8:03 PM Alexei Starovoitov
> > > > <alexei.starovoitov@...il.com> wrote:
> > > >>
> > > >> On Wed, Feb 20, 2019 at 3:59 PM Alexei Starovoitov
> > > >> <alexei.starovoitov@...il.com> wrote:
> > > >>>
> > > >>> On Thu, Feb 21, 2019 at 12:01:35AM +0100, Daniel Borkmann wrote:
> > > >>>> In 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> > > >>>> a check was added for BPF_PROG_RUN() that for every invocation preemption is
> > > >>>> disabled to not break eBPF assumptions (e.g. per-cpu map). Of course this does
> > > >>>> not count for seccomp because only cBPF -> eBPF is loaded here and it does
> > > >>>> not make use of any functionality that would require this assertion. Fix this
> > > >>>> false positive by adding and using SECCOMP_RUN() variant that does not have
> > > >>>> the cant_sleep(); check.
> > > >>>>
> > > >>>> Fixes: 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> > > >>>> Reported-by: syzbot+8bf19ee2aa580de7a2a7@...kaller.appspotmail.com
> > > >>>> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> > > >>>> Acked-by: Kees Cook <keescook@...omium.org>
> > > >>>
> > > >>> Applied, Thanks
> > > >>
> > > >> Actually I think it's a wrong approach to go long term.
> > > >> I'm thinking to revert it.
> > > >> I think it's better to disable preemption for duration of
> > > >> seccomp cbpf prog.
> > > >> It's short and there is really no reason for it to be preemptible.
> > > >> When seccomp switches to ebpf we'll have this weird inconsistency.
> > > >> Let's just disable preemption for seccomp as well.
> > > >
> > > > A lot of changes will be needed for seccomp ebpf -- not the least of
> > > > which is convincing me there is a use-case. ;)
> > > >
> > > > But the main issue is that I'm not a huge fan of dropping two
> > > > barriers() across syscall entry. That seems pretty heavy-duty for
> > > > something that is literally not needed right now.
> > >
> > > Yeah, I think it's okay to add once actually technically needed. Last
> > > time I looked, if I recall correctly, at least Chrome installs some
> > > heavy duty seccomp programs that go close to prog limit.
> >
> > Half of that is probably because that seccomp BPF code is so
> > inefficient, though.
> >
> > This snippet shows that those programs constantly recheck the high
> > halves of arguments:
> >
> > Some of the generated code is pointless because all reachable code
> > from that point on has the same outcome (the last "ret ALLOW" in the
> > following sample is unreachable because they've already checked that
> > the high bit of the low half is set, so the low half can't be 3):
>
> and with ebpf these optimizations will be available for free
> because llvm will remove unnecessary loads and simplify branches.
> There is no technical reason not to use ebpf in seccomp.
>
> When we discussed preemption of classic vs extended in socket filters
> context we agreed to make it a requirement that preemption must be
> disabled though it's not strictly necessary. RX side of socket filters
> was already non-preempt while TX was preemptible.
> We must not make an exception of this rule for seccomp.
> Hence I've reverted this commit.
>
> Here is the actual fix for seccomp:
> From: Alexei Starovoitov <ast@...nel.org>
> Date: Thu, 21 Feb 2019 10:40:14 -0800
> Subject: [PATCH] seccomp, bpf: disable preemption before calling into bpf prog
>
> All BPF programs must be called with preemption disabled.
>
> Signed-off-by: Alexei Starovoitov <ast@...nel.org>
> ---
>  kernel/seccomp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index e815781ed751..a43c601ac252 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -267,6 +267,7 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
>          * All filters in the list are evaluated and the lowest BPF return
>          * value always takes priority (ignoring the DATA).
>          */
> +       preempt_disable();
>         for (; f; f = f->prev) {
>                 u32 cur_ret = BPF_PROG_RUN(f->prog, sd);
>
> @@ -275,6 +276,7 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
>                         *match = f;
>                 }
>         }
> +       preempt_enable();
>         return ret;
>  }
>  #endif /* CONFIG_SECCOMP_FILTER */
> --
>
> Doing per-cpu increment of cache hot data is practically free and it makes seccomp
> play by the rules.

Other accesses should dominate the run time, yes. I'm still not a big
fan of unconditionally adding this, but I won't NAK. :P

-- 
Kees Cook