linux-kernel - Re: [Bug Report] bpf: incorrectly pruning runtime execution path

Message-ID: <480a5cfefc23446f7c82c5b87eef6306364132b9.camel@gmail.com>
Date:   Thu, 14 Dec 2023 01:35:26 +0200
From:   Eduard Zingerman <eddyz87@...il.com>
To:     Hao Sun <sunhao.th@...il.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        bpf <bpf@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [Bug Report] bpf: incorrectly pruning runtime execution path

On Wed, 2023-12-13 at 11:25 +0100, Hao Sun wrote:
[...]
 
> I tried to convert the repro to a valid test case in inline asm, but seems
> JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in clang-17.
> Will try after clang-18 is released.
> 
> #30 is expected to be executed, see below where everything after ";" is
> the runtime value:
>    ...
>    6: (36) if w8 >= 0x69 goto pc+1    ; w8 = 0xbe, always taken
>    ...
>   11: (45) if r0 & 0xfffffffe goto pc+3  ; r0 = 0x616, taken
>   ...
>   18: (56) if w8 != 0xf goto pc+3     ; w8 not touched, taken
>   ...
>   23: (bf) r5 = r8     ; w5 = 0xbe
>   24: (18) r2 = 0x4
>   26: (7e) if w8 s>= w0 goto pc+5    ; non-taken
>   27: (4f) r8 |= r8
>   28: (0f) r8 += r8
>   29: (d6) if w5 s<= 0x1d goto pc+2  ; non-taken
>   30: (18) r0 = 0x4      ; executed
> 
> Since the verifier prunes at #26, #30 is dead and eliminated. So, #30
> is executed after manually commenting out the dead code rewrite pass.
> 
> From my understanding, I think r0 should be marked as precise when
> first backtrack from #29, because r5 range at this point depends on w0
> as r8 and r5 share the same id at #26.

Hi Hao, Andrii,

I converted the program in question to a runnable test; here is a link to
the patch that adds it and disables dead code removal:
https://gist.github.com/eddyz87/e888ad70c947f28f94146a47e33cd378

Run the test as follows:
  ./test_progs -vvv -a verifier_and/pruning_test

And inspect the retval:
  do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
  run_subtest:FAIL:647 Unexpected retval: 1353935089 != 4

Note that I tried this test with two functions:
- bpf_get_current_cgroup_id: with this function I get retval 2, not 4 :)
- bpf_get_prandom_u32: with this function I get a random retval each time.

What is the expectation when 'bpf_get_current_cgroup_id' is used?
That it returns some number known to us, while the verifier treats it as
an unknown scalar?

Also, I find this portion of the verification log strange:

    ...
    13: (0f) r0 += r0                     ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,
                                                        var_off=(0x0; 0x3))
    14: (2f) r4 *= r4                     ; R4_w=scalar()
    15: (18) r3 = 0x1f00000034            ; R3_w=0x1f00000034
    17: (c4) w4 s>>= 29                   ; R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,
                                                        var_off=(0x0; 0xffffffff))
    18: (56) if w8 != 0xf goto pc+3       ; R8_w=scalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,
                                                        umin=smin32=umin32=15,umax=0xffffffff0000000f,
                                                        smax32=umax32=15,var_off=(0xf; 0xffffffff00000000))
    19: (d7) r3 = bswap32 r3              ; R3_w=scalar()
    20: (18) r2 = 0x1c                    ; R2=28
    22: (67) r4 <<= 2                     ; R4_w=scalar(smin=0,smax=umax=0x3fffffffc,
                                                        smax32=0x7ffffffc,umax32=0xfffffffc,
                                                        var_off=(0x0; 0x3fffffffc))
    23: (bf) r5 = r8                      ; R5_w=scalar(id=1,smin=0x800000000000000f,
                                                        smax=0x7fffffff0000000f,
                                                        umin=smin32=umin32=15,
                                                        umax=0xffffffff0000000f,
                                                        smax32=umax32=15,
                                                        var_off=(0xf; 0xffffffff00000000))
                                            R8=scalar(id=1,smin=0x800000000000000f,
                                                      smax=0x7fffffff0000000f,
                                                      umin=smin32=umin32=15,
                                                      umax=0xffffffff0000000f,
                                                      smax32=umax32=15,
                                                      var_off=(0xf; 0xffffffff00000000))
    24: (18) r2 = 0x4                     ; R2_w=4
    26: (7e) if w8 s>= w0 goto pc+5
    mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 
    mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4
    ...                   ^^^^^^^^^^
                          ^^^^^^^^^^
Here w8 == 15 and w0 is in range [0, 2], so the jump is predicted
(15 s>= w0 holds for every w0 in that range), but for some reason R0 is
not among the registers that would be marked precise.
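
The point about R0 can be illustrated with a toy model of the prediction
at insn 26. This is only a sketch in Python, not the verifier's code:
Range32 and predict_sge are made-up names, and only signed 32-bit bounds
are modelled. It shows that the "always taken" verdict depends on the
bounds of both operands, which is why one would expect both of them to be
tracked as precise.

```python
# Toy model of signed range-based branch prediction.
# Hypothetical names; this is NOT how the kernel implements it.
from dataclasses import dataclass


@dataclass(frozen=True)
class Range32:
    smin: int  # smallest possible signed 32-bit value
    smax: int  # largest possible signed 32-bit value


def predict_sge(dst: Range32, src: Range32):
    """Predict `if dst s>= src goto ...`.

    Returns True if the branch is always taken, False if it is never
    taken, and None if the ranges alone cannot decide.
    """
    if dst.smin >= src.smax:
        return True
    if dst.smax < src.smin:
        return False
    return None


# Register state from the log at insn 26: w8 == 15, w0 in [0, 2].
w8 = Range32(15, 15)
w0 = Range32(0, 2)
print(predict_sge(w8, w0))          # True: predicted as always taken

# With w0's bounds forgotten, the same comparison is undecidable.
w0_unknown = Range32(-2**31, 2**31 - 1)
print(predict_sge(w8, w0_unknown))  # None
```

In this model the prediction is lost as soon as w0's bounds are dropped,
so a state comparison that ignores w0 could treat two states as
equivalent even though they branch differently.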