netdev - Re: KASAN: use-after-free Read in ___bpf_prog

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <F4EB9C63-8ED6-4640-9BD4-4709F01589FD@gmail.com>
Date:   Mon, 16 Jan 2023 20:45:14 +0800
From:   Hao Sun <sunhao.th@...il.com>
To:     Yonghong Song <yhs@...a.com>
Cc:     bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>
Subject: Re: KASAN: use-after-free Read in ___bpf_prog_run



> On 12 Jan 2023, at 2:59 PM, Yonghong Song <yhs@...a.com> wrote:
> 
> 
> 
> On 1/9/23 5:21 AM, Hao Sun wrote:
>> Yonghong Song <yhs@...a.com> 于2022年12月18日周日 00:57写道：
>>> 
>>> 
>>> 
>>> On 12/16/22 10:54 PM, Hao Sun wrote:
>>>> 
>>>> 
>>>>> On 17 Dec 2022, at 1:07 PM, Yonghong Song <yhs@...a.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 12/14/22 11:49 PM, Hao Sun wrote:
>>>>>> Hi,
>>>>>> The following KASAN report can be triggered by loading and test
>>>>>> running this simple BPF prog with a random data/ctx:
>>>>>> 0: r0 = bpf_get_current_task_btf      ;
>>>>>> R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>>>>> 1: r0 = *(u32 *)(r0 +8192)       ;
>>>>>> R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>>>>> 2: exit
>>>>>> I've simplified the C reproducer but didn't find the root cause.
>>>>>> JIT was disabled, and the interpreter triggered UAF when executing
>>>>>> the load insn. A slab-out-of-bound read can also be triggered:
>>>>>> https://pastebin.com/raw/g9zXr8jU
>>>>>> This can be reproduced on:
>>>>>> HEAD commit: b148c8b9b926 selftests/bpf: Add few corner cases to test
>>>>>> padding handling of btf_dump
>>>>>> git tree: bpf-next
>>>>>> console log: https://pastebin.com/raw/1EUi9tJe
>>>>>> kernel config: https://pastebin.com/raw/rgY3AJDZ
>>>>>> C reproducer: https://pastebin.com/raw/cfVGuCBm
>>>>> 
>>>>> I I tried with your above kernel config and C reproducer and cannot reproduce the kasan issue you reported.
>>>>> 
>>>>> [root@...h-fb-vm1 bpf-next]# ./a.out
>>>>> func#0 @0
>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>> 0: (85) call bpf_get_current_task_btf#158     ; R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>>>> 1: (61) r0 = *(u32 *)(r0 +8192)       ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>>>> 2: (95) exit
>>>>> processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>>> 
>>>>> prog fd: 3
>>>>> [root@...h-fb-vm1 bpf-next]#
>>>>> 
>>>>> Your config indeed has kasan on.
>>>> 
>>>> Hi,
>>>> 
>>>> I can still reproduce this on a latest bpf-next build: 0e43662e61f25
>>>> (“tools/resolve_btfids: Use pkg-config to locate libelf”).
>>>> The simplified C reproducer sometime need to be run twice to trigger
>>>> the UAF. Also note that interpreter is required. Here is the original
>>>> C reproducer that loads and runs the BPF prog continuously for your
>>>> convenience:
>>>> https://pastebin.com/raw/WSJuNnVU
>>>> 
>>> 
>>> I still cannot reproduce with more than 10 runs. The config has jit off
>>> so it already uses interpreter. It has kasan on as well.
>>> # CONFIG_BPF_JIT is not set
>>> 
>>> Since you can reproduce it, I guess it would be great if you can
>>> continue to debug this.
>>> 
>> The load insn ‘r0 = *(u32*) (current + 8192)’ is OOB, because sizeof(task_struct)
>> is 7240 as shown in KASAN report. The issue is that struct task_struct is special,
>> its runtime size is actually smaller than it static type size. In X86:
>> task_struct->thread_struct->fpu->fpstate->union fpregs_state is
>> /*
>> * ...
>> * The size of the structure is determined by the largest
>> * member - which is the xsave area. The padding is there
>> * to ensure that statically-allocated task_structs (just
>> * the init_task today) have enough space.
>> */
>> union fpregs_state {
>> struct fregs_state fsave;
>> struct fxregs_state fxsave;
>> struct swregs_state soft;
>> struct xregs_state xsave;
>> u8 __padding[PAGE_SIZE];
>> };
>> In btf_struct_access(), the resolved size for task_struct is 10496, much bigger
>> than its runtime size, so the prog in reproducer passed the verifier and leads
>> to the oob. This can happen to all similar types, whose runtime size is smaller
>> than its static size.
>> Not sure how many similar cases are there, maybe special check to task_struct
>> is enough. Any hint on how this should be addressed?
> 
> This should a corner case, I am not aware of other allocations like this.
> 
> For a normal program, if the access chain looks
> like
> task_struct->thread_struct->fpu->fpstate->fpregs_state->{fsave,fxsave, soft, xsave},
> we should not hit this issue. So I think we don't need to address this
> issue in kernel. The test itself should filter this out.

Maybe I didn’t make my point clear. The issue here is that the runtime size
of task_struct is `arch_task_struct_size`, which equals to the following,
see fpu__init_task_struct_size():

sizeof(task_struct) - sizeof(fpregs_state) + fpu_kernel_cfg.default_size

However, the verifier validates task_struct access based on the ty info in
btf_vmlinux, which equals to sizeof(task_struct), much bigger than `arch_
task_struct_size` due to the `__padding[PAGE_SIZE]` in fpregs_state, this
leads to OOB access.

We should not allow progs that access the task_struct beyond `arch_task_
struct_size` to be loaded. So maybe we should add a fixup in `btf_parse_
vmlinux()` to assign the correct size to each type of this chain:
task_struct->thread_struct->fpu->fpstate->fpregs_state;
Or maybe we should fix this during generating BTF of vmlinux?