Message-ID: <26f04a6b-4248-6898-8612-793e02712017@huaweicloud.com>
Date: Wed, 23 Oct 2024 10:03:44 +0800
From: Hou Tao <houtao@...weicloud.com>
To: Byeonguk Jeong <jungbu2855@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Yonghong Song <yonghong.song@...ux.dev>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bpf: Fix out-of-bounds write in trie_get_next_key()
On 10/22/2024 9:45 AM, Byeonguk Jeong wrote:
> trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
> while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
> full paths from the root to leaves. For example, consider a trie whose
> max_prefixlen is 8, with nodes for keys 0x00/0, 0x00/1, 0x00/2, ...
> 0x00/8 inserted. Subsequent calls to trie_get_next_key() with a _key whose
> .prefixlen = 8 cause 9 nodes to be written to the node stack of size 8.
>
> Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map")
> Signed-off-by: Byeonguk Jeong <jungbu2855@...il.com>
> ---
Tested-by: Hou Tao <houtao1@...wei.com>
Without the fix, KASAN reports the out-of-bounds write shown below when
dumping all keys in the lpm-trie through bpf_map_get_next_key().
However, I have a dumb question: does it make sense to reject elements
with prefixlen = 0? I can't think of a use case where a zero-length
prefix would be useful.
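For reference, the off-by-one described in the commit message can be
modelled in userspace. This is only a sketch, not kernel code:
path_nodes() and stack_fits() are hypothetical helpers, and the model
assumes one node per prefix length from 0 to max_prefixlen, as in the
0x00/0 ... 0x00/8 example.

```c
#include <stddef.h>

/*
 * Hypothetical model: number of nodes on a full root-to-leaf path when
 * one node exists for every prefix length from 0 to max_prefixlen.
 */
static size_t path_nodes(size_t max_prefixlen)
{
	size_t prefixlen, nodes = 0;

	/* Prefix lengths 0, 1, ..., max_prefixlen each contribute a node. */
	for (prefixlen = 0; prefixlen <= max_prefixlen; prefixlen++)
		nodes++;

	return nodes;
}

/* Returns 1 if a node stack with 'slots' entries can hold that path. */
static int stack_fits(size_t slots, size_t max_prefixlen)
{
	return path_nodes(max_prefixlen) <= slots;
}
```

With max_prefixlen = 8, path_nodes() gives 9, so a stack of
max_prefixlen slots is one short and a stack of max_prefixlen + 1 slots
is just large enough for this case.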
==================================================================
BUG: KASAN: slab-out-of-bounds in trie_get_next_key+0x133/0x530
Write of size 8 at addr ffff8881076c2fc0 by task test_lpm_trie.b/446
CPU: 0 UID: 0 PID: 446 Comm: test_lpm_trie.b Not tainted 6.11.0+ #52
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
Call Trace:
<TASK>
dump_stack_lvl+0x6e/0xb0
print_report+0xce/0x610
? trie_get_next_key+0x133/0x530
? kasan_complete_mode_report_info+0x3c/0x200
? trie_get_next_key+0x133/0x530
kasan_report+0x9c/0xd0
? trie_get_next_key+0x133/0x530
__asan_store8+0x81/0xb0
trie_get_next_key+0x133/0x530
__sys_bpf+0x1b03/0x3140
? __pfx___sys_bpf+0x10/0x10
? __pfx_vfs_write+0x10/0x10
? find_held_lock+0x8e/0xb0
? ksys_write+0xee/0x180
? syscall_exit_to_user_mode+0xb3/0x220
? mark_held_locks+0x28/0x90
? mark_held_locks+0x28/0x90
__x64_sys_bpf+0x45/0x60
x64_sys_call+0x1b2a/0x20d0
do_syscall_64+0x5d/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f9c5e9c9c5d
......
</TASK>
Allocated by task 446:
kasan_save_stack+0x28/0x50
kasan_save_track+0x14/0x30
kasan_save_alloc_info+0x36/0x40
__kasan_kmalloc+0x84/0xa0
__kmalloc_noprof+0x214/0x540
trie_get_next_key+0xa7/0x530
__sys_bpf+0x1b03/0x3140
__x64_sys_bpf+0x45/0x60
x64_sys_call+0x1b2a/0x20d0
do_syscall_64+0x5d/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ffff8881076c2f80
which belongs to the cache kmalloc-rnd-09-64 of size 64
The buggy address is located 0 bytes to the right of
allocated 64-byte region [ffff8881076c2f80, ffff8881076c2fc0)
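The addresses in the report are consistent with the commit message. The
sketch below checks the arithmetic under two assumptions that the report
does not state directly: pointers are 8 bytes (x86-64), and the test map
uses max_prefixlen = 8, which is inferred from the 64-byte allocation.
slot_addr() and bytes_past_end() are hypothetical helper names.

```c
#include <stdint.h>

/* Assumption: sizeof(struct lpm_trie_node *) on x86-64. */
#define SLOT_SIZE 8ULL

/* Address of node_stack[index] for a stack that starts at 'base'. */
static uint64_t slot_addr(uint64_t base, uint64_t index)
{
	return base + index * SLOT_SIZE;
}

/* How many bytes past the end of an 'nslots'-entry stack 'addr' lies. */
static uint64_t bytes_past_end(uint64_t base, uint64_t nslots, uint64_t addr)
{
	return addr - slot_addr(base, nslots);
}
```

Under those assumptions, node_stack[8] of a stack at ffff8881076c2f80
lands at ffff8881076c2fc0, which is the faulting address and lies 0
bytes to the right of the 64-byte region.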
> kernel/bpf/lpm_trie.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 0218a5132ab5..9b60eda0f727 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -655,7 +655,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key)
> if (!key || key->prefixlen > trie->max_prefixlen)
> goto find_leftmost;
>
> - node_stack = kmalloc_array(trie->max_prefixlen,
> + node_stack = kmalloc_array(trie->max_prefixlen + 1,
> sizeof(struct lpm_trie_node *),
> GFP_ATOMIC | __GFP_NOWARN);
> if (!node_stack)