[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f80991aa-3a49-451a-9a82-ac57982dcb28@huaweicloud.com>
Date: Sat, 20 Apr 2024 16:33:15 +0800
From: Xu Kuohai <xukuohai@...weicloud.com>
To: Eduard Zingerman <eddyz87@...il.com>, bpf@...r.kernel.org,
netdev@...r.kernel.org, linux-security-module@...r.kernel.org,
linux-kselftest@...r.kernel.org
Cc: Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <martin.lau@...ux.dev>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Matt Bobrowski <mattbobrowski@...gle.com>,
Brendan Jackman <jackmanb@...omium.org>, Paul Moore <paul@...l-moore.com>,
James Morris <jmorris@...ei.org>, "Serge E . Hallyn" <serge@...lyn.com>,
Khadija Kamran <kamrankhadijadj@...il.com>,
Casey Schaufler <casey@...aufler-ca.com>,
Ondrej Mosnacek <omosnace@...hat.com>, Kees Cook <keescook@...omium.org>,
John Johansen <john.johansen@...onical.com>,
Lukas Bulwahn <lukas.bulwahn@...il.com>,
Roberto Sassu <roberto.sassu@...wei.com>,
Shung-Hsi Yu <shung-hsi.yu@...e.com>
Subject: Re: [PATCH bpf-next v3 07/11] bpf: Fix a false rejection caused by
AND operation
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
> On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@...wei.com>
>>
>> With lsm return value check, the no-alu32 version test_libbpf_get_fd_by_id_opts
>> is rejected by the verifier, and the log says:
>>
>> 0: R1=ctx() R10=fp0
>> ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>> 0: (b7) r0 = 0 ; R0_w=0
>> 1: (79) r2 = *(u64 *)(r1 +0)
>> func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map'
>> 2: R1=ctx() R2_w=trusted_ptr_bpf_map()
>> ; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29
>> 2: (18) r3 = 0xffff9742c0951a00 ; R3_w=map_ptr(map=data_input,ks=4,vs=4)
>> 4: (5d) if r2 != r3 goto pc+4 ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4)
>> ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>> 5: (79) r0 = *(u64 *)(r1 +8) ; R0_w=scalar() R1=ctx()
>> ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32
>> 6: (67) r0 <<= 62 ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000))
>> 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
>> ; @ test_libbpf_get_fd_by_id_opts.c:0
>> 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
>> ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
>> 9: (95) exit
>>
>> And here is the C code of the prog.
>>
>> SEC("lsm/bpf_map")
>> int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode)
>> {
>> if (map != (struct bpf_map *)&data_input)
>> return 0;
>>
>> if (fmode & FMODE_WRITE)
>> return -EACCES;
>>
>> return 0;
>> }
>>
>> It is clear that the prog can only return either 0 or -EACCESS, and both
>> values are legal.
>>
>> So why is it rejected by the verifier?
>>
>> The verifier log shows that the second if and return value setting
>> statements in the prog is optimized to bitwise operations "r0 s>>= 63"
>> and "r0 &= -13". The verifier correctly deduces that the the value of
>> r0 is in the range [-1, 0] after verifing instruction "r0 s>>= 63".
>> But when the verifier proceeds to verify instruction "r0 &= -13", it
>> fails to deduce the correct value range of r0.
>>
>> 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
>> 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
>>
>> So why the verifier fails to deduce the result of 'r0 &= -13'?
>>
>> The verifier uses tnum to track values, and the two ranges "[-1, 0]" and
>> "[0, -1ULL]" are encoded to the same tnum. When verifing instruction
>> "r0 &= -13", the verifier erroneously deduces the result from
>> "[0, -1ULL] AND -13", which is out of the expected return range
>> [-4095, 0].
>>
>> To fix it, this patch simply adds a special SCALAR32 case for the
>> verifier. That is, when the source operand of the AND instruction is
>> a constant and the destination operand changes from negative to
>> non-negative and falls in range [-256, 256], deduce the result range
>> by enumerating all possible AND results.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@...wei.com>
>> ---
>
> Hello,
>
> Sorry for the delay, I had to think about this issue a bit.
> I found the clang transformation that generates the pattern this patch
> tries to handle.
> It is located in DAGCombiner::SimplifySelectCC() method (see [1]).
> The transformation happens as a part of DAG to DAG rewrites
> (LLVM uses several internal representations:
> - generic optimizer uses LLVM IR, most of the work is done
> using this representation;
> - before instruction selection IR is converted to Selection DAG,
> some optimizations are applied at this stage,
> all such optimizations are a set of pattern replacements;
> - Selection DAG is converted to machine code, some optimizations
> are applied at the machine code level).
>
> Full pattern is described as follows:
>
> // fold (select_cc seteq (and x, y), 0, 0, A) -> (and (sra (shl x)) A)
> // where y is has a single bit set.
> // A plaintext description would be, we can turn the SELECT_CC into an AND
> // when the condition can be materialized as an all-ones register. Any
> // single bit-test can be materialized as an all-ones register with
> // shift-left and shift-right-arith.
>
> For this particular test case the DAG is converted as follows:
>
> .---------------- lhs The meaning of this select_cc is:
> | .------- rhs `lhs == rhs ? true value : false value`
> | | .----- true value
> | | | .-- false value
> v v v v
> (select_cc seteq (and X 2) 0 0 -13)
> ^
> -> '---------------.
> (and (sra (sll X 62) 63) |
> -13) |
> |
> Before pattern is applied, it checks that second 'and' operand has
> only one bit set, (which is true for '2').
>
> The pattern itself generates logical shift left / arithmetic shift
> right pair, that ensures that result is either all ones (-1) or all
> zeros (0). Hence, applying 'and' to shifts result and false value
> generates a correct result.
>
Thanks for your detailed and invaluable explanation!
> In my opinion the approach taken by this patch is sub-optimal:
> - 512 iterations is too much;
> - this does not cover all code that could be generated by the above
> mentioned LLVM transformation
> (e.g. second 'and' operand could be 1 << 16).
>
> Instead, I suggest to make a special case for source or dst register
> of '&=' operation being in range [-1,0].
> Meaning that one of the '&=' operands is either:
> - all ones, in which case the counterpart is the result of the operation;
> - all zeros, in which case zero is the result of the operation;
> - derive MIN and MAX values based on above two observations.
>
Totally agree, I'll cook a new patch as you suggested.
> [1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985a51565/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L5391
>
> Best regards,
> Eduard
Powered by blists - more mailing lists