linux-kernel - Re: [PATCH v2] bpf: fix stackmap overflow check in __bpf_get


Message-ID: <a0e172e9-e4d3-427f-b237-ba8f6b3772f4@arnaud-lcm.com>
Date: Tue, 5 Aug 2025 21:49:48 +0100
From: Arnaud Lecomte <contact@...aud-lcm.com>
To: Yonghong Song <yonghong.song@...ux.dev>, song@...nel.org,
 jolsa@...nel.org, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
 martin.lau@...ux.dev, eddyz87@...il.com, john.fastabend@...il.com,
 kpsingh@...nel.org, sdf@...ichev.me, haoluo@...gle.com
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzkaller-bugs@...glegroups.com,
 syzbot+c9b724fbb41cf2538b7b@...kaller.appspotmail.com
Subject: Re: [PATCH v2] bpf: fix stackmap overflow check in
 __bpf_get_stackid()

Hi,
I gave it several tries and I can't find a nice way to do this properly.
The main challenge is to find a way to detect memory corruption. I
wanted to place a canary value by tweaking the map size, but from a BPF
program's perspective we have no way to access the size of a
stack_map_bucket. If we decide to do this computation manually, we
would end up with maintainability issues:
#include "vmlinux.h"
#include "bpf/bpf_helpers.h"

#define MAX_STACK_DEPTH 32
#define CANARY_VALUE 0xBADCAFE

/* Calculate size based on known layout:
 * - fnode: sizeof(void*)
 * - hash: 4 bytes
 * - nr: 4 bytes
 * - data: MAX_STACK_DEPTH * 8 bytes
 * - canary: 8 bytes
 */
#define VALUE_SIZE (sizeof(void*) + 4 + 4 + (MAX_STACK_DEPTH * 8) + 8)

struct {
     __uint(type, BPF_MAP_TYPE_STACK_TRACE);
     __uint(max_entries, 1);
     __uint(value_size, VALUE_SIZE);
     __uint(key_size, sizeof(u32));
} stackmap SEC(".maps");

static __attribute__((noinline)) void recursive_helper(int depth) {
     if (depth <= 0) return;
     asm volatile("" ::: "memory");
     recursive_helper(depth - 1);
}

SEC("kprobe/do_sys_open")
int test_stack_overflow(void *ctx) {
     u32 key = 0;
     u64 *stack = bpf_map_lookup_elem(&stackmap, &key);
     if (!stack) return 0;

     stack[MAX_STACK_DEPTH] = CANARY_VALUE;

     /* Force minimum stack depth */
     recursive_helper(MAX_STACK_DEPTH + 10);

     (void)bpf_get_stackid(ctx, &stackmap, 0);
     return 0;
}

char _license[] SEC("license") = "GPL";
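
To illustrate the canary idea outside the kernel, here is a minimal userspace sketch. It assumes a hypothetical `struct mock_bucket` that mirrors the layout listed in the comment above (the kernel's real `struct stack_map_bucket` may differ), and it derives the allocation size with `offsetof()` rather than hand-added byte counts. The helper names `bucket_size_with_canary()` and `canary_survives()` are invented for this example. The buffer always has room for the canary slot, so the "overflow" stays inside the allocation and only clobbers the canary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_STACK_DEPTH 32
#define CANARY_VALUE 0xBADCAFEULL

/* Hypothetical userspace mirror of the bucket layout described above. */
struct mock_bucket {
	void *fnode;      /* stand-in for the freelist node */
	uint32_t hash;
	uint32_t nr;
	uint64_t data[];  /* MAX_STACK_DEPTH entries, then one canary slot */
};

/* Size derived from the type, not from manually summed field sizes. */
static size_t bucket_size_with_canary(void)
{
	return offsetof(struct mock_bucket, data) +
	       (MAX_STACK_DEPTH + 1) * sizeof(uint64_t);
}

/*
 * Stores nr fake entries into the bucket. When do_clamp is set, nr is
 * first limited to MAX_STACK_DEPTH. Returns 1 if the canary is intact,
 * 0 if it was overwritten, -1 on allocation failure.
 */
static int canary_survives(uint32_t nr, int do_clamp)
{
	struct mock_bucket *b;
	uint32_t i;
	int ok;

	/* Keep every write inside the allocation (data + canary slot). */
	assert(nr <= MAX_STACK_DEPTH + 1);

	b = calloc(1, bucket_size_with_canary());
	if (!b)
		return -1;

	b->data[MAX_STACK_DEPTH] = CANARY_VALUE;

	if (do_clamp && nr > MAX_STACK_DEPTH)
		nr = MAX_STACK_DEPTH;

	for (i = 0; i < nr; i++)
		b->data[i] = 0x1000 + i;  /* fake instruction pointers */
	b->nr = nr;

	ok = (b->data[MAX_STACK_DEPTH] == CANARY_VALUE);
	free(b);
	return ok;
}
```

Calling `canary_survives(MAX_STACK_DEPTH + 1, 0)` models the unclamped copy and reports a clobbered canary, while the same call with `do_clamp` set leaves it intact. This only models the detection idea; it does not address the problem that a BPF program cannot see the bucket size.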

On 01/08/2025 19:16, Lecomte, Arnaud wrote:
> Well, it turns out it is less straightforward than it looked to
> detect the memory corruption without KASAN. I am currently on holiday
> for the next 3 days so I have limited access to a computer. I should
> be able to sort this out on Monday.
>
> Thanks,
> Arnaud
>
> On 30/07/2025 08:10, Arnaud Lecomte wrote:
>> On 29/07/2025 23:45, Yonghong Song wrote:
>>>
>>>
>>> On 7/29/25 9:56 AM, Arnaud Lecomte wrote:
>>>> Syzkaller reported a KASAN slab-out-of-bounds write in
>>>> __bpf_get_stackid() when copying stack trace data. The issue occurs
>>>> when the perf trace contains more stack entries than the stack map
>>>> bucket can hold, leading to an out-of-bounds write in the bucket's
>>>> data array. For build_id mode, we use sizeof(struct bpf_stack_build_id)
>>>> to determine capacity, and for normal mode we use sizeof(u64).
>>>>
>>>> Reported-by: syzbot+c9b724fbb41cf2538b7b@...kaller.appspotmail.com
>>>> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b
>>>> Tested-by: syzbot+c9b724fbb41cf2538b7b@...kaller.appspotmail.com
>>>> Signed-off-by: Arnaud Lecomte <contact@...aud-lcm.com>
>>>
>>> Could you add a selftest? This way folks can easily find out what is
>>> the problem and why this fix solves the issue correctly.
>>>
>> Sure, will be done after work
>> Thanks,
>> Arnaud
>>>> ---
>>>> Changes in v2:
>>>>   - Use utility stack_map_data_size to compute the stack map size
>>>> ---
>>>>   kernel/bpf/stackmap.c | 8 +++++++-
>>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>>>> index 3615c06b7dfa..6f225d477f07 100644
>>>> --- a/kernel/bpf/stackmap.c
>>>> +++ b/kernel/bpf/stackmap.c
>>>> @@ -230,7 +230,7 @@ static long __bpf_get_stackid(struct bpf_map *map,
>>>>       struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
>>>>       struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
>>>>       u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
>>>> -    u32 hash, id, trace_nr, trace_len, i;
>>>> +    u32 hash, id, trace_nr, trace_len, i, max_depth;
>>>>       bool user = flags & BPF_F_USER_STACK;
>>>>       u64 *ips;
>>>>       bool hash_matches;
>>>> @@ -241,6 +241,12 @@ static long __bpf_get_stackid(struct bpf_map *map,
>>>>       trace_nr = trace->nr - skip;
>>>>       trace_len = trace_nr * sizeof(u64);
>>>> +
>>>> +    /* Clamp the trace to max allowed depth */
>>>> +    max_depth = smap->map.value_size / stack_map_data_size(map);
>>>> +    if (trace_nr > max_depth)
>>>> +        trace_nr = max_depth;
>>>> +
>>>>       ips = trace->ip + skip;
>>>>       hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0);
>>>>       id = hash & (smap->n_buckets - 1);
>>>
>>>
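
The clamp in the hunk above can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: `clamp_trace()` and `struct clamped_trace` are hypothetical names, `elem_size` stands in for whatever `stack_map_data_size(map)` returns (`sizeof(u64)` in normal mode, `sizeof(struct bpf_stack_build_id)` in build_id mode, per the commit message), and deriving the byte length from the clamped count is a choice made by this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct clamped_trace {
	uint32_t nr;   /* number of entries that fit in the bucket */
	uint32_t len;  /* corresponding length in bytes */
};

/*
 * Limits trace_nr to the number of elem_size-sized entries that fit in
 * value_size bytes, mirroring the max_depth computation in the patch.
 */
static struct clamped_trace clamp_trace(uint32_t trace_nr,
					uint32_t value_size,
					uint32_t elem_size)
{
	struct clamped_trace out;
	uint32_t max_depth;

	assert(elem_size != 0);

	max_depth = value_size / elem_size;
	if (trace_nr > max_depth)
		trace_nr = max_depth;

	out.nr = trace_nr;
	out.len = trace_nr * elem_size;  /* computed after clamping */
	return out;
}
```

For example, with a 256-byte value and 8-byte entries, a 42-entry trace is clamped to 32 entries (256 bytes), while a 10-entry trace passes through unchanged.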