[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez1Edngv8LLb_svz8Kq3vQKf7-a_Q1whyHfXiTDP8FO+rw@mail.gmail.com>
Date: Thu, 21 Feb 2019 13:56:53 +0100
From: Jann Horn <jannh@...gle.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: Kees Cook <keescook@...omium.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Andy Lutomirski <luto@...capital.net>,
Alexei Starovoitov <ast@...nel.org>,
Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2] bpf, seccomp: fix false positive preemption
splat for cbpf->ebpf progs
On Thu, Feb 21, 2019 at 9:53 AM Daniel Borkmann <daniel@...earbox.net> wrote:
> On 02/21/2019 06:31 AM, Kees Cook wrote:
> > On Wed, Feb 20, 2019 at 8:03 PM Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> >>
> >> On Wed, Feb 20, 2019 at 3:59 PM Alexei Starovoitov
> >> <alexei.starovoitov@...il.com> wrote:
> >>>
> >>> On Thu, Feb 21, 2019 at 12:01:35AM +0100, Daniel Borkmann wrote:
> >>>> In 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> >>>> a check was added for BPF_PROG_RUN() that for every invocation preemption is
> >>>> disabled to not break eBPF assumptions (e.g. per-cpu map). Of course this does
> >>>> not count for seccomp because only cBPF -> eBPF is loaded here and it does
> >>>> not make use of any functionality that would require this assertion. Fix this
> >>>> false positive by adding and using SECCOMP_RUN() variant that does not have
> >>>> the cant_sleep(); check.
> >>>>
> >>>> Fixes: 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
> >>>> Reported-by: syzbot+8bf19ee2aa580de7a2a7@...kaller.appspotmail.com
> >>>> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> >>>> Acked-by: Kees Cook <keescook@...omium.org>
> >>>
> >>> Applied, Thanks
> >>
> >> Actually I think it's a wrong approach to go long term.
> >> I'm thinking to revert it.
> >> I think it's better to disable preemption for duration of
> >> seccomp cbpf prog.
> >> It's short and there is really no reason for it to be preemptible.
> >> When seccomp switches to ebpf we'll have this weird inconsistency.
> >> Let's just disable preemption for seccomp as well.
> >
> > A lot of changes will be needed for seccomp ebpf -- not the least of
> > which is convincing me there is a use-case. ;)
> >
> > But the main issue is that I'm not a huge fan of dropping two
> > barriers() across syscall entry. That seems pretty heavy-duty for
> > something that is literally not needed right now.
>
> Yeah, I think it's okay to add once actually technically needed. Last
> time I looked, if I recall correctly, at least Chrome installs some
> heavy duty seccomp programs that go close to prog limit.
Half of that is probably because that seccomp BPF code is so
inefficient, though.
This snippet shows that those programs constantly recheck the high
halves of arguments:
014e if args[1].high == 0x00000000: [true +3, false +0]
0153 if args[1].low == 0x00000003: [true +135, false +0] -> ret
ALLOW (syscalls: fcntl)
0155 if args[1].high == 0x00000000: [true +3, false +0]
015a if args[1].low == 0x00000001: [true +128, false +0] -> ret
ALLOW (syscalls: fcntl)
015c if args[1].high == 0x00000000: [true +3, false +0]
0161 if args[1].low == 0x00000002: [true +121, false +0] -> ret
ALLOW (syscalls: fcntl)
0163 if args[1].high == 0x00000000: [true +3, false +0]
0168 if args[1].low == 0x00000006: [true +114, false +0] ->
ret ALLOW (syscalls: fcntl)
016a if args[1].high == 0x00000000: [true +3, false +0]
016f if args[1].low == 0x00000007: [true +107, false +0] ->
ret ALLOW (syscalls: fcntl)
0171 if args[1].high == 0x00000000: [true +3, false +0]
0176 if args[1].low == 0x00000005: [true +100, false +0]
-> ret ALLOW (syscalls: fcntl)
0178 if args[1].high == 0x00000000: [true +3, false +0]
017d if args[1].low == 0x00000000: [true +93, false +0]
-> ret ALLOW (syscalls: fcntl)
017f if args[1].high == 0x00000000: [true +3, false +0]
0184 if args[1].low == 0x00000406: [true +86, false
+0] -> ret ALLOW (syscalls: fcntl)
0186 if args[1].high == 0x00000000: [true +3, false +0]
018b if args[1].low != 0x00000004: [true +80, false
+0] -> ret TRAP
018d if args[2].high != 0x00000000: [true +78, false
+0] -> ret TRAP
018f if args[2].low COMMON-BITS 0xffe363fc: [true
+76, false +75] -> ret TRAP
01db ret ALLOW (syscalls: fcntl)
Some of the generated code is pointless because all reachable code
from that point on has the same outcome (the last "ret ALLOW" in the
following sample is unreachable because they've already checked that
the high bit of the low half is set, so the low half can't be 3):
00c9 if args[0].low == 0x00000005: [true +16, false +0] ->
ret ALLOW (syscalls: clock_gettime, clock_getres)
00cb if args[0].high == 0x00000000: [true +3, false +0]
00d0 if args[0].low == 0x00000003: [true +9, false +0] ->
ret ALLOW (syscalls: clock_gettime, clock_getres)
01dc ret TRAP
00cc if args[0].high != 0xffffffff: [true +8, false +0] -> ret TRAP
00ce if args[0].low COMMON-BITS 0x80000000: [true +0, false +6]
00d0 if args[0].low == 0x00000003: [true +9, false +0] ->
ret ALLOW (syscalls: clock_gettime, clock_getres)
01dc ret TRAP
01d7 ret TRAP
Powered by blists - more mailing lists