Message-Id: <933A445C-725E-4BC2-8860-2D0A92C34C58@gmail.com>
Date: Mon, 9 Jan 2023 21:21:22 +0800
From: Hao Sun <sunhao.th@...il.com>
To: Yonghong Song <yhs@...a.com>
Cc: bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
netdev <netdev@...r.kernel.org>
Subject: Re: KASAN: use-after-free Read in ___bpf_prog_run
On Sun, 18 Dec 2022 at 00:57, Yonghong Song <yhs@...a.com> wrote:
>
>
>
> On 12/16/22 10:54 PM, Hao Sun wrote:
>>
>>
>>> On 17 Dec 2022, at 1:07 PM, Yonghong Song <yhs@...a.com> wrote:
>>>
>>>
>>>
>>> On 12/14/22 11:49 PM, Hao Sun wrote:
>>>> Hi,
>>>> The following KASAN report can be triggered by loading and test
>>>> running this simple BPF prog with a random data/ctx:
>>>> 0: r0 = bpf_get_current_task_btf ;
>>>> R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>>> 1: r0 = *(u32 *)(r0 +8192) ;
>>>> R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>>> 2: exit
>>>> I've simplified the C reproducer but didn't find the root cause.
>>>> JIT was disabled, and the interpreter triggered UAF when executing
>>>> the load insn. A slab-out-of-bound read can also be triggered:
>>>> https://pastebin.com/raw/g9zXr8jU
>>>> This can be reproduced on:
>>>> HEAD commit: b148c8b9b926 selftests/bpf: Add few corner cases to test
>>>> padding handling of btf_dump
>>>> git tree: bpf-next
>>>> console log: https://pastebin.com/raw/1EUi9tJe
>>>> kernel config: https://pastebin.com/raw/rgY3AJDZ
>>>> C reproducer: https://pastebin.com/raw/cfVGuCBm
>>>
>>> I tried with your above kernel config and C reproducer and cannot reproduce the kasan issue you reported.
>>>
>>> [root@...h-fb-vm1 bpf-next]# ./a.out
>>> func#0 @0
>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>> 0: (85) call bpf_get_current_task_btf#158 ; R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>> 1: (61) r0 = *(u32 *)(r0 +8192) ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>> 2: (95) exit
>>> processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>
>>> prog fd: 3
>>> [root@...h-fb-vm1 bpf-next]#
>>>
>>> Your config indeed has kasan on.
>>
>> Hi,
>>
>> I can still reproduce this on a latest bpf-next build: 0e43662e61f25
>> (“tools/resolve_btfids: Use pkg-config to locate libelf”).
>> The simplified C reproducer sometimes needs to be run twice to trigger
>> the UAF. Also note that the interpreter is required. Here is the original
>> C reproducer that loads and runs the BPF prog continuously for your
>> convenience:
>> https://pastebin.com/raw/WSJuNnVU
>>
>
> I still cannot reproduce with more than 10 runs. The config has jit off
> so it already uses interpreter. It has kasan on as well.
> # CONFIG_BPF_JIT is not set
>
> Since you can reproduce it, I guess it would be great if you can
> continue to debug this.
>
The load insn 'r0 = *(u32 *)(current + 8192)' is OOB, because the allocated
task_struct object is only 7240 bytes, as shown in the KASAN report. The issue
is that struct task_struct is special: its runtime size is actually smaller
than its static type size. On x86,
task_struct->thread_struct->fpu->fpstate->union fpregs_state is:
/*
 * ...
 * The size of the structure is determined by the largest
 * member - which is the xsave area. The padding is there
 * to ensure that statically-allocated task_structs (just
 * the init_task today) have enough space.
 */
union fpregs_state {
	struct fregs_state fsave;
	struct fxregs_state fxsave;
	struct swregs_state soft;
	struct xregs_state xsave;
	u8 __padding[PAGE_SIZE];
};
In btf_struct_access(), the resolved size for task_struct is 10496, much bigger
than its runtime size, so the prog in the reproducer passes the verifier and
then performs the OOB read. This can happen to any similar type whose runtime
size is smaller than its static size.
I'm not sure how many similar cases there are; maybe a special check for
task_struct is enough. Any hint on how this should be addressed?
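
To make the size mismatch concrete, here is a minimal userspace sketch. It is
not kernel code: every demo_* name is hypothetical, and the 1024-byte and
832-byte sizes are made-up assumptions chosen only for illustration. It shows
how a bounds check against the static type size accepts an offset that lies
outside the bytes a dynamic allocation would actually hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096   /* assumption: 4 KiB pages */
#define DEMO_XSAVE_SIZE 832   /* assumption: bytes really used at runtime */

/* Hypothetical stand-in for union fpregs_state: the padding member
 * determines the static size of the union. */
union demo_fpregs_state {
	uint8_t xsave[DEMO_XSAVE_SIZE];
	uint8_t __padding[DEMO_PAGE_SIZE];
};

static_assert(sizeof(union demo_fpregs_state) == DEMO_PAGE_SIZE,
	      "padding sets the static size");

/* Hypothetical stand-in for task_struct, with the union as last member. */
struct demo_task {
	uint8_t other_fields[1024];
	union demo_fpregs_state fpregs;
};

/* Size a dynamically allocated object would have: everything before the
 * union plus only the part of the union that is really used. */
static size_t demo_runtime_size(void)
{
	return offsetof(struct demo_task, fpregs) + DEMO_XSAVE_SIZE;
}

/* Generic "does [off, off + size) fit in limit bytes" check. */
static bool demo_access_ok(size_t limit, size_t off, size_t size)
{
	return off <= limit && size <= limit - off;
}
```

With these assumed sizes, demo_access_ok(sizeof(struct demo_task), 4000, 4)
accepts the access, while demo_access_ok(demo_runtime_size(), 4000, 4) rejects
it. That is the same shape of problem as the load at offset 8192 above: the
check is done against the static size, not the allocated one.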