netdev - Re: Self-XORing BPF registers is undefined behavior

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <984adc13-568e-8195-da1a-05135dbf954f@solarflare.com>
Date:   Fri, 29 May 2020 01:17:03 +0100
From:   Edward Cree <ecree@...arflare.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Alexander Potapenko <glider@...gle.com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Michal Kubecek <mkubecek@...e.cz>,
        Alexei Starovoitov <ast@...nel.org>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        Networking <netdev@...r.kernel.org>
Subject: Re: Self-XORing BPF registers is undefined behavior

On 28/05/2020 17:00, Alexei Starovoitov wrote:
> xoring of two identical values is undefined in standard?
I believe it is in this case, yes; even without the complication
 of array references that happen to alias, Alexander's foo1() is
 undefined behaviour under C89 (and also C99 which handles the
 case differently).

>From the definitions section (1.6) of the C89 draft [1]:
> * Undefined behavior --- behavior, upon use of a nonportable or
> erroneous program construct, of erroneous data, or of
> indeterminately-valued objects, for which the Standard imposes
> no requirements.
And from 3.5.7 'Initialization':
> If an object that has automatic storage duration is not
> initialized explicitly, its value is indeterminate.
Since the standard doesn't say anything about self-XORing that
 could make it 'special' in this regard, the compiler isn't
 required to notice that it's a self-XOR, and (in the tradition
 of compiler-writers the world over) is entitled to optimise the
 program based on the assumption that the programmer has not
 committed UB, so in the foo1() example would be strictly within
 its rights to generate a binary that contained no XOR
 instruction at all.  UB, as you surely know, isn't guaranteed to
 do something 'sensible'.
And in the BPF example, if the compiler at some point manages to
 statically figure out that regs[insn->dst_reg] is uninitialised,
 it might say "hey, I can just grab any old free register and
 declare that that's now regs[insn->dst_reg] without filling it.
 And then it can do the same for regs[insn->src_reg], or heck,
 even choose to fill that one (this is now legal even though the
 pointers alias, because you already committed UB), and do a xor
 with different regs and produce garbage results.

(In C99 it gets subtler because an 'indeterminate value' is
 defined to be 'either a valid value or a trap representation',
 so arguably the compiler can only do this stuff if it _has_
 trap representations for the type in question.)

> If that's really true such standard worth nothing.
You may be right, but plenty of compiler writers will take that
 as a reason to ignore you, and if (say) a gcc upgrade breaks
 filter.c, they will merrily close any bugs you file as NOTABUG
 or INVALID or GOAWAYWEDONTCARE.
Is this annoying?  Extremely; the XOR-clearing _would_ be fine
 if the standard had chosen to define things differently (e.g.
 it's fine under a hypothetical 'C99 but uninitialised auto
 variables have unspecified rather than indeterminate values').
I can't see a way to work around it that doesn't have a possible
 performance cost (alternatives to Alexander's MOV_IMM 0 include
 initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME
 and PROG_NAME_ARGS), although there is the question of whether
 anyone who cares about performance (or security) will be using
 BPF without the JIT anyway.
But I don't think "Alexandar has to do the data-flow analysis in
 KMSAN" is the right answer; KMSAN's diagnostic here is _correct_
 in that ___bpf_prog_run() invokes UB on this XOR.
Now, since it would be rather difficult and pointless for the
 compiler to statically prove that the reg is uninitialised (it
 would need to generate a special code-path just for this one
 case), maybe the best thing to do is to get GCC folks to bless
 this usage (perhaps defining uninitialised variables to have
 what C99 would call an unspecified value), at which point it
 becomes defined under the "gnu89" pseudo-standard which is what
 we compile the kernel with.  At which point KMSAN can be taught
 that this is OK, and can figure out "hey, you're self-XORing an
 unspecified value, the result is determinate" and clear the
 shadow bits appropriately.

-ed

[1]: https://port70.net/~nsz/c/c89/c89-draft.html