linux-kernel - Re: [PATCH bpf-next] Detect jumping to reserved code during check

Message-ID: <79dd71a5-446d-9b05-7d37-40e49bbf04ae@iogearbox.net>
Date:   Wed, 11 Oct 2023 16:50:00 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Hao Sun <sunhao.th@...il.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>,
        Yonghong Song <yonghong.song@...ux.dev>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next] Detect jumping to reserved code during
 check_cfg()

On 10/11/23 8:46 AM, Hao Sun wrote:
> On Wed, Oct 11, 2023 at 4:42 AM Andrii Nakryiko
> <andrii.nakryiko@...il.com> wrote:
>> On Tue, Oct 10, 2023 at 1:33 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>>> On 10/10/23 9:02 AM, John Fastabend wrote:
>>>> Hao Sun wrote:
>>>>> Currently, we don't check if the branch-taken target of a jump is the
>>>>> reserved code of ld_imm64. Instead, such an issue is caught in check_ld_imm().
>>>>> The verifier gives the following log in such a case:
>>>>>
>>>>> func#0 @0
>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>> 0: (18) r4 = 0xffff888103436000       ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
>>>>> 2: (18) r1 = 0x1d                     ; R1_w=29
>>>>> 4: (55) if r4 != 0x0 goto pc+4        ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
>>>>> 5: (1c) w1 -= w1                      ; R1_w=0
>>>>> 6: (18) r5 = 0x32                     ; R5_w=50
>>>>> 8: (56) if w5 != 0xfffffff4 goto pc-2
>>>>> mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1
>>>>> mark_precise: frame0: regs=r5 stack= before 6: (18) r5 = 0x32
>>>>> 7: R5_w=50
>>>>> 7: BUG_ld_00
>>>>> invalid BPF_LD_IMM insn
>>>>>
>>>>> Here the verifier rejects the program because it thinks insn at 7 is an
>>>>> invalid BPF_LD_IMM, but such an error log is not accurate: the issue is a
>>>>> jump to reserved code, not that the program contains an invalid insn.
>>>>> Therefore, make the verifier check the jump target during check_cfg(). For
>>>>> the same program, the verifier reports the following log:
>>>>
>>>> I think we at least would want a test case for this. Also how did you create
>>>> this case? Is it just something you did manually and noticed a strange error?
>>>
>>> Curious as well.
>>>
>>> We do have test cases which try to jump into the middle of a double insn, as
>>> can be seen from this patch breaking BPF CI with a log mismatch below (which
>>> still needs to be adapted, too). Either way, it probably doesn't hurt to also add
>>> the above snippet as a test.
>>>
>>> Hao, as I understand, the patch here is a usability improvement (not a fix per se)
>>> where we reject such cases earlier during cfg check rather than at a later point
>>> where we validate ld_imm instruction. Or are there cases you found which were not
>>> yet captured via current check_ld_imm()?
>>>
>>> test_verifier failure log:
>>>
>>>     #458/u test1 ld_imm64 FAIL
>>>     Unexpected verifier log!
>>>     EXP: R1 pointer comparison
>>>     RES:
>>>     FAIL
>>>     Unexpected error message!
>>>          EXP: R1 pointer comparison
>>>          RES: jump to reserved code from insn 0 to 2
>>>     verification time 22 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>
>>>     jump to reserved code from insn 0 to 2
>>>     verification time 22 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>     #458/p test1 ld_imm64 FAIL
>>>     Unexpected verifier log!
>>>     EXP: invalid BPF_LD_IMM insn
>>>     RES:
>>>     FAIL
>>>     Unexpected error message!
>>>          EXP: invalid BPF_LD_IMM insn
>>>          RES: jump to reserved code from insn 0 to 2
>>>     verification time 9 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>
>>>     jump to reserved code from insn 0 to 2
>>>     verification time 9 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>     #459/u test2 ld_imm64 FAIL
>>>     Unexpected verifier log!
>>>     EXP: R1 pointer comparison
>>>     RES:
>>>     FAIL
>>>     Unexpected error message!
>>>          EXP: R1 pointer comparison
>>>          RES: jump to reserved code from insn 0 to 2
>>>     verification time 11 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>
>>>     jump to reserved code from insn 0 to 2
>>>     verification time 11 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>     #459/p test2 ld_imm64 FAIL
>>>     Unexpected verifier log!
>>>     EXP: invalid BPF_LD_IMM insn
>>>     RES:
>>>     FAIL
>>>     Unexpected error message!
>>>          EXP: invalid BPF_LD_IMM insn
>>>          RES: jump to reserved code from insn 0 to 2
>>>     verification time 8 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>
>>>     jump to reserved code from insn 0 to 2
>>>     verification time 8 usec
>>>     stack depth 0
>>>     processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>     #460/u test3 ld_imm64 OK
>>>
>>>>> func#0 @0
>>>>> jump to reserved code from insn 8 to 7
>>>>>
>>>>> Signed-off-by: Hao Sun <sunhao.th@...il.com>
>>>
>>> nit: This needs to be before the "---" line.
>>>
>>>>> ---
>>>>>    kernel/bpf/verifier.c | 7 +++++++
>>>>>    1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>>> index eed7350e15f4..725ac0b464cf 100644
>>>>> --- a/kernel/bpf/verifier.c
>>>>> +++ b/kernel/bpf/verifier.c
>>>>> @@ -14980,6 +14980,7 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
>>>>>    {
>>>>>       int *insn_stack = env->cfg.insn_stack;
>>>>>       int *insn_state = env->cfg.insn_state;
>>>>> +    struct bpf_insn *insns = env->prog->insnsi;
>>>>>
>>>>>       if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
>>>>>               return DONE_EXPLORING;
>>>>> @@ -14993,6 +14994,12 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
>>>>>               return -EINVAL;
>>>>>       }
>>>>>
>>>>> +    if (e == BRANCH && insns[w].code == 0) {
>>>>> +            verbose_linfo(env, t, "%d", t);
>>>>> +            verbose(env, "jump to reserved code from insn %d to %d\n", t, w);
>>>>> +            return -EINVAL;
>>>>> +    }
>>>
>>> Other than that, lgtm.
>>
>> We do rely quite a lot on verifier not complaining eagerly about some
>> potentially invalid instructions if it's provable that some portion of
>> the code won't ever be reached (think using .rodata variables for
>> feature gating, poisoning instructions due to failed CO-RE relocation,
>> which libbpf does actively, except it's using a call to non-existing
>> helper). As such, check_cfg() is a wrong place to do such validity
>> checks because some of the branches might never be run and validated
>> in practice.
> 
> Don't really agree. Jump to the middle of ld_imm64 is just like jumping
> out of bounds, both break the CFG integrity immediately. For those
> apparently incorrect jumps, rejecting early makes everything simple;
> otherwise, we probably need some rewrite in the end.

Could you elaborate on the 'breaking CFG integrity immediately'? This was
what I was trying to gather earlier with log improvement vs actual fix.

Do you mean /potentially/ breaking CFG integrity, if, say, we had a double
insn jump in the future and there is a back-jump to the 2nd part of the insn?
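
For reference, a minimal standalone sketch of the condition under discussion.
This is not the kernel code: struct insn, check_jump() and the jump_status
values are hypothetical names, and only the opcode and offset fields are
modelled. It relies on the second slot of ld_imm64 carrying code == 0 (the
same test the patch uses) and on a jump at index t with offset off landing on
t + off + 1. The sample array mirrors the opcodes from the log quoted above;
the exit at insn 9 is an assumption, since the log does not show it.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct bpf_insn: only the fields this sketch needs. */
struct insn {
	uint8_t code;  /* opcode; 0 marks the reserved second slot of ld_imm64 */
	int16_t off;   /* signed jump offset, relative to the next insn */
};

enum jump_status {
	JUMP_OK = 0,
	JUMP_OUT_OF_RANGE = -1,
	JUMP_TO_RESERVED = -2,
};

/* Classify the branch-taken target of the jump at index t. */
static enum jump_status check_jump(const struct insn *prog, int len, int t)
{
	int w;

	assert(t >= 0 && t < len);
	w = t + prog[t].off + 1;  /* same arithmetic as "goto pc+off" */
	if (w < 0 || w >= len)
		return JUMP_OUT_OF_RANGE;
	if (prog[w].code == 0)    /* landed inside a two-slot ld_imm64 */
		return JUMP_TO_RESERVED;
	return JUMP_OK;
}

/* Opcodes of the program from the log; insn 9 (exit) is assumed. */
static const struct insn sample_prog[] = {
	{ 0x18, 0 }, { 0x00, 0 },  /* 0: r4 = <map ptr> (ld_imm64, two slots) */
	{ 0x18, 0 }, { 0x00, 0 },  /* 2: r1 = 0x1d */
	{ 0x55, 4 },               /* 4: if r4 != 0x0 goto pc+4 */
	{ 0x1c, 0 },               /* 5: w1 -= w1 */
	{ 0x18, 0 }, { 0x00, 0 },  /* 6: r5 = 0x32 */
	{ 0x56, -2 },              /* 8: if w5 != 0xfffffff4 goto pc-2 */
	{ 0x95, 0 },               /* 9: exit (assumed, not shown in the log) */
};
```

With this model, the jump at insn 8 resolves to index 7, the second slot of
the ld_imm64 at 6, while the jump at insn 4 resolves to index 9 and is fine.
Note the sketch looks at every jump, reachable or not, which is exactly the
eagerness Andrii objects to above.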

> Also, as you mentioned, libbpf relies on non-existing helpers, not jump
> to the middle of ld_imm64. It seems better and easier to not leave this
> hole.

Thanks,
Daniel