[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG_fn=WrWLC7v8btiZfRSSj1Oj6WymLmwWq4x1Ss3rQ4P0cOOA@mail.gmail.com>
Date: Tue, 18 Dec 2018 15:36:39 +0100
From: Alexander Potapenko <glider@...gle.com>
To: daniel@...earbox.net
Cc: mkubecek@...e.cz, ast@...nel.org,
Dmitriy Vyukov <dvyukov@...gle.com>,
Networking <netdev@...r.kernel.org>
Subject: Re: Self-XORing BPF registers is undefined behavior
On Thu, Dec 13, 2018 at 3:54 PM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 12/13/2018 02:18 PM, Daniel Borkmann wrote:
> > On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
> >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@...e.cz> wrote:
> >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> >>>>> Hi BPF maintainers,
> >>>>>
> >>>>> some time ago KMSAN found an issue in BPF code which we decided to
> >>>>> suppress at that point, but now I'd like to bring it to your
> >>>>> attention.
> >>>>> Namely, some BPF programs may contain instructions that XOR a register
> >>>>> with itself.
> >>>>> This effectively results in the following C code:
> >>>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>>>> or
> >>>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> >>>>> being executed.
> >>>>>
> >>>>> According to the C11 standard this is undefined behavior, so KMSAN
> >>>>> reports an error in this case.
> >
> > Can you elaborate / quote the exact bit from C11 that KMSAN is referring
> > to? (The below that Michal was quoting or something else?)
> >
> > Does that only refer to C11 standard? Note that kernel's Makefile +430
> > explicitly states 'std=gnu89' and not 'std=c11' [0].
> >
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89
> >
> >>>> Can you quote the part of the standard saying this is undefined
> >>>> behavior? I couldn't find anything else than
> >>>>
> >>>> If the value being stored in an object is read from another object
> >>>> that overlaps in any way the storage of the first object, then the
> >>>> overlap shall be exact and the two objects shall have qualified or
> >>>> unqualified versions of a compatible type; otherwise, the behavior
> >>>> is undefined.
> >>>>
> >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly
> >>>> they mean by "exact overlap" and the standard doesn't seem to define
> >>>> the term but if the two objects are actually the same, they certainly
> >>>> have compatible types.
> >
> > Here is an example for the overlap quoted above; I don't think this
> > applies to our case since it would be "exact". Quote [1]:
> >
> > struct S { int x; int y; };
> > struct T { int z; struct S s; };
> > union U { struct S f ; struct T g; } u;
> >
> > main(){
> > u.f = u.g.s;
> > return 0;
> > }
> >
> > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945
> >
> >>> I think I understand now. You didn't want to say that the statement
> >>>
> >>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>>
> >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
> >>> uninitialized. Right?
> >> Yes. Sorry for being unclear.
> >> By default regs[] is uninitialized, so we need to initialize it before
> >> using the register values.
> >> I am also wondering if it's possible to simply copy the uninitialized
> >> register values from regs[] to the userspace via maps.
>
> Nope, not possible. And to elaborate on cBPF / eBPF cases:
If you mean that it's not possible to generate a eBPF program that
XORs an uninitialized register with itself, you may be actually right.
I've reverted https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d,
and syzkaller couldn't find a program triggering this behavior so far.
Perhaps something had changed in eBPF code since I encountered this error.
I'll be watching the dashboard and will let you know if I have a
reliable reproducer for the aforementioned problem.
Thanks for checking!
> # cat /proc/sys/net/core/bpf_jit_enable
> 0
>
> For the eBPF case, lets take (test_verifier.c):
>
> {
> "xor test",
> .insns = {
> BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
> BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
> BPF_EXIT_INSN(),
> },
> .result = ACCEPT,
> .retval = 0,
> },
>
> # ./test_verifier 0
> #0/u xor test FAIL
> Failed to load prog 'Permission denied'!
> 0: (af) r2 ^= r2
> R2 !read_ok
> #0/p xor test FAIL
> Failed to load prog 'Permission denied'!
> 0: (af) r2 ^= r2
> R2 !read_ok
> Summary: 0 PASSED, 0 SKIPPED, 2 FAILED
>
> And for the initialized case:
>
> {
> "xor test",
> .insns = {
> BPF_MOV64_IMM(BPF_REG_2, 42),
> BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
> BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
> BPF_EXIT_INSN(),
> },
> .result = ACCEPT,
> .retval = 0,
> },
>
> # ./test_verifier 0
> #0/u xor test OK
> #0/p xor test OK
> Summary: 2 PASSED, 0 SKIPPED, 0 FAILED
>
> And cBPF (test_bpf.c):
>
> {
> "xor test",
> .u.insns = {
> BPF_STMT(BPF_RET | BPF_A, 0)
> },
> CLASSIC,
> { },
> { { 0, 0 }, },
> },
>
> [...]
> [88819.276142] test_bpf: #0 xor test jited:0 PASS
> [...]
>
> Given latter two use the same underlying interpreter implementation for
> the 'good' and 'bad' case, I don't see how compiler would have any degree
> of freedom to judge about the runtime dependent BPF insn input stream
> here. If it would assume UB, then also 'good' xor case would be broken
> and miscompiled which it is clearly not (even for some very obvious case
> below). While KMSAN is analyzing runtime memory access, compiler cannot..
IIUC some hardware, like IA64, may do these checks at runtime (see
https://blogs.msdn.microsoft.com/oldnewthing/20150727-00/?p=90821 for
an example)
Please note that referring to results of compiling a finite number of
programs with UB with a finite number of compilers doesn't prove
anything.
It won't help much in the case tomorrow e.g. GCC changes its behavior
to make use of some new hardware feature.
> # cat foo.c
> #include <stdio.h>
> int main(void)
> {
> int foo = foo ^ foo;
> printf("%u\n", foo);
> return 0;
> }
>
> # clang -Wall -O2 foo.c
> foo.c:4:12: warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized]
> int foo = foo ^ foo;
> ~~~ ^~~
> # ./a.out
> 0
>
> disasm:
>
> 00000000004004d0 <main>:
> 4004d0: 50 push %rax
> 4004d1: bf 74 05 40 00 mov $0x400574,%edi
> 4004d6: 31 f6 xor %esi,%esi
> 4004d8: 31 c0 xor %eax,%eax
> 4004da: e8 f1 fe ff ff callq 4003d0 <printf@plt>
> 4004df: 31 c0 xor %eax,%eax
> 4004e1: 59 pop %rcx
> 4004e2: c3 retq
> 4004e3: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> 4004ea: 00 00 00
> 4004ed: 0f 1f 00 nopl (%rax)
>
> # gcc -Wall -O2 foo.c (neither complains with using -Wextra)
> # ./a.out
> 0
>
> disasm:
>
> 0000000000000560 <main>:
> 560: 48 8d 35 ad 01 00 00 lea 0x1ad(%rip),%rsi # 714 <_IO_stdin_used+0x4>
> 567: 48 83 ec 08 sub $0x8,%rsp
> 56b: 31 d2 xor %edx,%edx
> 56d: bf 01 00 00 00 mov $0x1,%edi
> 572: 31 c0 xor %eax,%eax
> 574: e8 c7 ff ff ff callq 540 <__printf_chk@plt>
> 579: 31 c0 xor %eax,%eax
> 57b: 48 83 c4 08 add $0x8,%rsp
> 57f: c3 retq
>
> Cheers,
> Daniel
--
Alexander Potapenko
Software Engineer
Google Germany GmbH
Erika-Mann-Straße, 33
80636 München
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Powered by blists - more mailing lists