linux-kernel - Re: [PATCH bpf-next v2 1/2] bpf: Fix a umin

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9aa7b0ed-dfe0-325a-ad22-94a30d167cda@huaweicloud.com>
Date:   Tue, 21 Mar 2023 16:28:27 +0800
From:   Xu Kuohai <xukuohai@...weicloud.com>
To:     Daniel Borkmann <daniel@...earbox.net>,
        Xu Kuohai <xukuohai@...weicloud.com>, bpf@...r.kernel.org,
        linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     Alexei Starovoitov <ast@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>
Subject: Re: [PATCH bpf-next v2 1/2] bpf: Fix a umin > umax reg bound error

On 3/21/2023 12:42 AM, Daniel Borkmann wrote:
> On 3/17/23 11:24 PM, Daniel Borkmann wrote:
>> On 3/14/23 9:34 PM, Xu Kuohai wrote:
>>> From: Xu Kuohai <xukuohai@...wei.com>
>>>
>>> After commit 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking"),
>>> the following bpf prog is rejected:
>>>
>>> 0: (61) r2 = *(u32 *)(r1 +0)          ; R2_w=pkt(off=0,r=0,imm=0)
>>> 1: (61) r3 = *(u32 *)(r1 +4)          ; R3_w=pkt_end(off=0,imm=0)
>>> 2: (bf) r1 = r2
>>> 3: (07) r1 += 1
>>> 4: (2d) if r1 > r3 goto pc+8
>>> 5: (71) r1 = *(u8 *)(r2 +0)           ; R1_w=scalar(umax=255,var_off=(0x0; 0xff))
>>> 6: (18) r0 = 0x7fffffffffffff10
>>> 8: (0f) r1 += r0                      ; R1_w=scalar(umin=0x7fffffffffffff10,umax=0x800000000000000f)
>>> 9: (18) r0 = 0x8000000000000000
>>> 11: (07) r0 += 1
>>> 12: (ad) if r0 < r1 goto pc-2
>>> 13: (b7) r0 = 0
>>> 14: (95) exit
>>>
>>> And the verifier log says:
>>>
>>> [...]
>>>
>>> from 12 to 11: R0_w=-9223372036854775794 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
>>> 11: (07) r0 += 1                      ; R0_w=-9223372036854775793
>>> 12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
>>> 13: safe
>>>
>>> from 12 to 11: R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
>>> 11: (07) r0 += 1                      ; R0_w=-9223372036854775792
>>> 12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775792 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
>>> 13: safe
>>>
>>> [...]
>>>
>>> What can be seen here is that r1->umin grows blindly and becomes bigger
>>> than r1->umax. The reason is because the loop does not terminate, when
>>> r0 increases to r1->umax_value, the following code in reg_set_min_max()
>>> sets r1->umin_value to r1->umax_value + 1 blindly:
>>>
>>> case BPF_JGT:
>>> {
>>>          if (is_jmp32) {
>>>                  [...]
>>>          } else {
>>>                  u64 false_umax = opcode == BPF_JGT ? val    : val - 1;
>>>                  u64 true_umin = opcode == BPF_JGT ? val + 1 : val;
>>>
>>>                  false_reg->umax_value = min(false_reg->umax_value, false_umax);
>>>                  true_reg->umin_value = max(true_reg->umin_value, true_umin);
>>>          }
>>>          break;
>>> }
>>>
>>> Why the loop does not terminate is because tnum_is_const(src_reg->var_off)
>>> always returns false, causing is_branch_taken() to be skipped:
>>>
>>> if (src_reg->type == SCALAR_VALUE &&
>>>        !is_jmp32 && tnum_is_const(src_reg->var_off)) {
>>>     pred = is_branch_taken(dst_reg,   // could not reach here
>>>                    src_reg->var_off.value,
>>>                    opcode,
>>>                    is_jmp32);
>>> }
>>>
>>> Why tnum_is_const(src_reg->var_off) always returns false is because
>>> r1->umin_value starts increasing from 0x7fffffffffffff10, always bigger
>>> than U32_MAX, causing the __reg_combine_64_into_32() to mark the lower
>>> 32 bits unbounded, i.e. not a constant.
>>>
>>> To fix it:
>>> 1. avoid increasing reg lower bound to a value bigger than the upper bound,
>>>     or decreasing reg upper bound to a value smaller than the lower bound.
>>> 2. set 32-bit min/max values to the lower 32 bits of the 64-bit min/max values
>>>     when the 64-bit min/max values are equal.
>>
>> Should both these be separate patches, meaning are both of them strictly
>> required as one logical entity or not? From your description it's not really
>> clear wrt reg_{inc,dec}_{u32,u64}_{min,max} and if this is mainly defensive
>> or required.
> 
> Fyi, I'm working on the below draft patch which passes all of test_verifier and
> your test cases as well from patch 2. Will cook a proper patch once I'm through
> with further analysis:
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d517d13878cf..8bef2ed89e87 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1823,7 +1823,7 @@ static void __reg_bound_offset(struct bpf_reg_state *reg)
>          struct tnum var64_off = tnum_intersect(reg->var_off,
>                                                 tnum_range(reg->umin_value,
>                                                            reg->umax_value));
> -       struct tnum var32_off = tnum_intersect(tnum_subreg(reg->var_off),
> +       struct tnum var32_off = tnum_intersect(tnum_subreg(var64_off),
>                                                  tnum_range(reg->u32_min_value,
>                                                             reg->u32_max_value));
> .

[forget to reply to the list, resend]

Thanks for the patch, it works for me. But as replied in the other mail,
it seems more reasonable to converge var32_off to constant by converging
[u32_min_value, u32_max_value] to constant.