linux-kernel - Re: [Bug Report] bpf: incorrectly pruning runtime execution path

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <917DAD9F-8697-45B8-8890-D33393F6CDF1@gmail.com>
Date:   Thu, 14 Dec 2023 10:38:01 +0100
From:   Hao Sun <sunhao.th@...il.com>
To:     Eduard Zingerman <eddyz87@...il.com>
Cc:     Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        bpf <bpf@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [Bug Report] bpf: incorrectly pruning runtime execution path



> On 14 Dec 2023, at 12:35 AM, Eduard Zingerman <eddyz87@...il.com> wrote:
> 
> On Wed, 2023-12-13 at 11:25 +0100, Hao Sun wrote:
> [...]
> 
>> I tried to convert the repro to a valid test case in inline asm, but seems
>> JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in clang-17.
>> Will try after clang-18 is released.
>> 
>> #30 is expected to be executed, see below where everything after ";" is
>> the runtime value:
>>   ...
>>   6: (36) if w8 >= 0x69 goto pc+1    ; w8 = 0xbe, always taken
>>   ...
>>  11: (45) if r0 & 0xfffffffe goto pc+3  ; r0 = 0x616, taken
>>  ...
>>  18: (56) if w8 != 0xf goto pc+3     ; w8 not touched, taken
>>  ...
>>  23: (bf) r5 = r8     ; w5 = 0xbe
>>  24: (18) r2 = 0x4
>>  26: (7e) if w8 s>= w0 goto pc+5    ; non-taken
>>  27: (4f) r8 |= r8
>>  28: (0f) r8 += r8
>>  29: (d6) if w5 s<= 0x1d goto pc+2  ; non-taken
>>  30: (18) r0 = 0x4      ; executed
>> 
>> Since the verifier prunes at #26, #30 is dead and eliminated. So, #30
>> is executed after manually commenting out the dead code rewrite pass.
>> 
>> From my understanding, I think r0 should be marked as precise when
>> first backtrack from #29, because r5 range at this point depends on w0
>> as r8 and r5 share the same id at #26.
> 
> Hi Hao, Andrii,
> 
> I converted program in question to a runnable test, here is a link to
> the patch adding it and disabling dead code removal:
> https://gist.github.com/eddyz87/e888ad70c947f28f94146a47e33cd378
> 
> Run the test as follows:
>  ./test_progs -vvv -a verifier_and/pruning_test
> 
> And inspect the retval:
>  do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
>  run_subtest:FAIL:647 Unexpected retval: 1353935089 != 4
> 

Thanks for the runnable test! 

The reason why retval checks fails is that the way you disable dead
code removal pass is not complete. Disable opt_remove_dead_code()
just prevent the instruction #30 from being removed, but also note
opt_hard_wire_dead_code_branches(), which convert conditional jump
into unconditional one, so #30 is still skipped.



> Note that I tried this test with two functions:
> - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :)
> - bpf_get_prandom_u32, with this function I get a random retval each time.
> 
> What is the expectation when 'bpf_get_current_cgroup_id' is used?
> That it is some known (to us) number, but verifier treats it as unknown scalar?
> 

Either one would work, but to make #30 always taken, r0 should be
non-zero.