[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <103c41eb-cbc1-5af2-f158-9875adb03d6b@fb.com>
Date: Fri, 7 May 2021 11:32:05 -0700
From: Yonghong Song <yhs@...com>
To: Brendan Jackman <jackmanb@...gle.com>
CC: bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: Help with verifier failure
On 5/7/21 8:05 AM, Brendan Jackman wrote:
> On Thu, 22 Apr 2021 at 16:35, Yonghong Song <yhs@...com> wrote:
>>
>>
>>
>> On 4/22/21 6:55 AM, Brendan Jackman wrote:
>>> On Wed, 21 Apr 2021 at 18:59, Yonghong Song <yhs@...com> wrote:
>>>> On 4/21/21 8:06 AM, Yonghong Song wrote:
>>>>> On 4/21/21 5:23 AM, Brendan Jackman wrote:
>>>>> Thanks, Brendan. Looks at least the verifier failure is triggered
>>>>> by recent clang changes. I will take a look whether we could
>>>>> improve verifier for such a case and whether we could improve
>>>>> clang to avoid generate such codes the verifier doesn't like.
>>>>> Will get back to you once I had concrete analysis.
>>>>>
>>>>>>
>>>>>> This seems like it must be a common pitfall, any idea what we can do
>>>>>> to fix it
>>>>>> and avoid it in future? Am I misunderstanding the issue?
>>>>
>>>> First, for the example code you provided, I checked with llvm11, llvm12
>>>> and latest trunk llvm (llvm13-dev) and they all generated similar codes,
>>>> which may trigger verifier failure. Somehow you original code could be
>>>> different may only show up with a recent llvm, I guess.
>>>>
>>>> Checking llvm IR, the divergence between "w2 = w8" and "if r8 < 0x1000"
>>>> appears in insn scheduling phase related handling PHIs. Need to further
>>>> check whether it is possible to prevent the compiler from generating
>>>> such codes.
>>>>
>>>> The latest kernel already had the ability to track register equivalence.
>>>> However, the tracking is conservative for 32bit mov like "w2 = w8" as
>>>> you described in the above. if we have code like "r2 = r8; if r8 <
>>>> 0x1000 ...", we will be all good.
>>>>
>>>> The following hack fixed the issue,
>>>>
>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>> index 58730872f7e5..54f418fd6a4a 100644
>>>> --- a/kernel/bpf/verifier.c
>>>> +++ b/kernel/bpf/verifier.c
>>>> @@ -7728,12 +7728,20 @@ static int check_alu_op(struct bpf_verifier_env
>>>> *env, struct bpf_insn *insn)
>>>> insn->src_reg);
>>>> return -EACCES;
>>>> } else if (src_reg->type == SCALAR_VALUE) {
>>>> + /* If src_reg is in 32bit range,
>>>> there is
>>>> + * no need to reset the ID.
>>>> + */
>>>> + bool is_32bit_src =
>>>> src_reg->umax_value <= 0x7fffffff;
>>>> +
>>>> + if (is_32bit_src && !src_reg->id)
>>>> + src_reg->id = ++env->id_gen;
>>>> *dst_reg = *src_reg;
>>>> /* Make sure ID is cleared
>>>> otherwise
>>>> * dst_reg min/max could be
>>>> incorrectly
>>>> * propagated into src_reg by
>>>> find_equal_scalars()
>>>> */
>>>> - dst_reg->id = 0;
>>>> + if (!is_32bit_src)
>>>> + dst_reg->id = 0;
>>>> dst_reg->live |= REG_LIVE_WRITTEN;
>>>> dst_reg->subreg_def =
>>>> env->insn_idx + 1;
>>>> } else {
>>>>
>>>> Basically, for a 32bit mov insn like "w2 = w8", if we can ensure
>>>> that "w8" is 32bit and has no possibility that upper 32bit is set
>>>> for r8, we can declare them equivalent. This fixed your issue.
>
> I just got around to looking into this - spent some time reading and
> realised it's simpler than I thought :) I also double checked that it
> fixes the test with my current Clang too.
>
> Beyond cleaning up and putting it into a patch, did you have anything
> in particular in mind when you called this a "hack"?
>
> Do I understand correctly that in this code we only need to check
> umax_value, because it anyway gets folded into the other bounds fields
> during adjust_min_max_reg_vals?
If the umax_value is less than or equal to INT_MAX, if all *_value's are
consistent in the register state, yes, it will be sufficient to
declare the reg is indeed holding a 32bit value in a 64bit register.
I mentioned it as a "hack" as I did not go through all the reg
range refining before/after this piece of codes. Since you have
looked at it and it seems fine. I would suggest you can just
with my patch above plus your test and submit it to the mailing
list for review.
>
> It seems like the next rung on the "ladder" of solution completeness
> here would be quite a big step up, something like a more comprehensive
> representation of register relationships (instead of just "these regs
> have the same value" vs. "these regs have no relationship"), which I
> guess would be more extreme than necessary right now.
We have to weigh between verifier complexity and whether it is general
enough for compilation transformation. Yes, if you have such use cases,
please share and we can discuss how to address them.
Powered by blists - more mailing lists