lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250315082251.32679-1-enjuk@amazon.com>
Date: Sat, 15 Mar 2025 17:21:55 +0900
From: Kohei Enju <enjuk@...zon.com>
To: <eddyz87@...il.com>
CC: <andrii@...nel.org>, <ast@...nel.org>, <bpf@...r.kernel.org>,
	<daniel@...earbox.net>, <enjuk@...zon.com>, <haoluo@...gle.com>,
	<iii@...ux.ibm.com>, <john.fastabend@...il.com>, <jolsa@...nel.org>,
	<kohei.enju@...il.com>, <kpsingh@...nel.org>, <kuniyu@...zon.com>,
	<linux-kernel@...r.kernel.org>, <martin.lau@...ux.dev>, <sdf@...ichev.me>,
	<song@...nel.org>, <syzbot+a5964227adc0f904549c@...kaller.appspotmail.com>,
	<yepeilin@...gle.com>, <yonghong.song@...ux.dev>
Subject: Re: [PATCH bpf-next v1] bpf: Fix out-of-bounds read in check_atomic_load/store()

>> syzbot reported the following splat [0].
>> 
>> In check_atomic_load/store(), register validity is not checked before
>> atomic_ptr_type_ok(). This causes the out-of-bounds read in is_ctx_reg()
>> called from atomic_ptr_type_ok() when the register number is MAX_BPF_REG
>> or greater.
>> 
>> Add check_reg_arg() before atomic_ptr_type_ok(), and return early when
>> the register is invalid.
>> 
>> [0]
>>  BUG: KASAN: slab-out-of-bounds in is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
>>  BUG: KASAN: slab-out-of-bounds in atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
>>  Read of size 4 at addr ffff888141b0d690 by task syz-executor143/5842
>> 
>>  CPU: 1 UID: 0 PID: 5842 Comm: syz-executor143 Not tainted 6.14.0-rc3-syzkaller-gf28214603dc6 #0
>>  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
>>  Call Trace:
>>   <TASK>
>>   __dump_stack lib/dump_stack.c:94 [inline]
>>   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
>>   print_address_description mm/kasan/report.c:408 [inline]
>>   print_report+0x16e/0x5b0 mm/kasan/report.c:521
>>   kasan_report+0x143/0x180 mm/kasan/report.c:634
>>   is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
>>   atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
>>   check_atomic_store kernel/bpf/verifier.c:7804 [inline]
>>   check_atomic kernel/bpf/verifier.c:7841 [inline]
>>   do_check+0x89dd/0xedd0 kernel/bpf/verifier.c:19334
>>   do_check_common+0x1678/0x2080 kernel/bpf/verifier.c:22600
>>   do_check_main kernel/bpf/verifier.c:22691 [inline]
>>   bpf_check+0x165c8/0x1cca0 kernel/bpf/verifier.c:23821
>>   bpf_prog_load+0x1664/0x20e0 kernel/bpf/syscall.c:2967
>>   __sys_bpf+0x4ea/0x820 kernel/bpf/syscall.c:5811
>>   __do_sys_bpf kernel/bpf/syscall.c:5918 [inline]
>>   __se_sys_bpf kernel/bpf/syscall.c:5916 [inline]
>>   __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5916
>>   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>  RIP: 0033:0x7fa3ac86bab9
>>  Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>>  RSP: 002b:00007ffe50fff5f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
>>  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa3ac86bab9
>>  RDX: 0000000000000094 RSI: 00004000000009c0 RDI: 0000000000000005
>>  RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000006
>>  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>>  R13: 431bde82d7b634db R14: 0000000000000001 R15: 0000000000000001
>>   </TASK>
>> 
>>  Allocated by task 5842:
>>   kasan_save_stack mm/kasan/common.c:47 [inline]
>>   kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>>   poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
>>   __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
>>   kasan_kmalloc include/linux/kasan.h:260 [inline]
>>   __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4325
>>   kmalloc_noprof include/linux/slab.h:901 [inline]
>>   kzalloc_noprof include/linux/slab.h:1037 [inline]
>>   do_check_common+0x1ec/0x2080 kernel/bpf/verifier.c:22499
>>   do_check_main kernel/bpf/verifier.c:22691 [inline]
>>   bpf_check+0x165c8/0x1cca0 kernel/bpf/verifier.c:23821
>>   bpf_prog_load+0x1664/0x20e0 kernel/bpf/syscall.c:2967
>>   __sys_bpf+0x4ea/0x820 kernel/bpf/syscall.c:5811
>>   __do_sys_bpf kernel/bpf/syscall.c:5918 [inline]
>>   __se_sys_bpf kernel/bpf/syscall.c:5916 [inline]
>>   __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5916
>>   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> 
>>  The buggy address belongs to the object at ffff888141b0d000
>>   which belongs to the cache kmalloc-2k of size 2048
>>  The buggy address is located 312 bytes to the right of
>>   allocated 1368-byte region [ffff888141b0d000, ffff888141b0d558)
>> 
>>  The buggy address belongs to the physical page:
>>  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x141b08
>>  head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
>>  flags: 0x57ff00000000040(head|node=1|zone=2|lastcpupid=0x7ff)
>>  page_type: f5(slab)
>>  raw: 057ff00000000040 ffff88801b042000 dead000000000100 dead000000000122
>>  raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
>>  head: 057ff00000000040 ffff88801b042000 dead000000000100 dead000000000122
>>  head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
>>  head: 057ff00000000003 ffffea000506c201 ffffffffffffffff 0000000000000000
>>  head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
>>  page dumped because: kasan: bad access detected
>>  page_owner tracks the page as allocated
>>  page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 1, tgid 1 (swapper/0), ts 8909973200, free_ts 0
>>   set_page_owner include/linux/page_owner.h:32 [inline]
>>   post_alloc_hook+0x1f4/0x240 mm/page_alloc.c:1585
>>   prep_new_page mm/page_alloc.c:1593 [inline]
>>   get_page_from_freelist+0x3a8c/0x3c20 mm/page_alloc.c:3538
>>   __alloc_frozen_pages_noprof+0x264/0x580 mm/page_alloc.c:4805
>>   alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
>>   alloc_slab_page mm/slub.c:2423 [inline]
>>   allocate_slab+0x8f/0x3a0 mm/slub.c:2587
>>   new_slab mm/slub.c:2640 [inline]
>>   ___slab_alloc+0xc27/0x14a0 mm/slub.c:3826
>>   __slab_alloc+0x58/0xa0 mm/slub.c:3916
>>   __slab_alloc_node mm/slub.c:3991 [inline]
>>   slab_alloc_node mm/slub.c:4152 [inline]
>>   __kmalloc_cache_noprof+0x27b/0x390 mm/slub.c:4320
>>   kmalloc_noprof include/linux/slab.h:901 [inline]
>>   kzalloc_noprof include/linux/slab.h:1037 [inline]
>>   virtio_pci_probe+0x54/0x340 drivers/virtio/virtio_pci_common.c:689
>>   local_pci_probe drivers/pci/pci-driver.c:324 [inline]
>>   pci_call_probe drivers/pci/pci-driver.c:392 [inline]
>>   __pci_device_probe drivers/pci/pci-driver.c:417 [inline]
>>   pci_device_probe+0x6c5/0xa10 drivers/pci/pci-driver.c:451
>>   really_probe+0x2b9/0xad0 drivers/base/dd.c:658
>>   __driver_probe_device+0x1a2/0x390 drivers/base/dd.c:800
>>   driver_probe_device+0x50/0x430 drivers/base/dd.c:830
>>   __driver_attach+0x45f/0x710 drivers/base/dd.c:1216
>>   bus_for_each_dev+0x239/0x2b0 drivers/base/bus.c:370
>>   bus_add_driver+0x346/0x670 drivers/base/bus.c:678
>>  page_owner free stack trace missing
>> 
>>  Memory state around the buggy address:
>>   ffff888141b0d580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>   ffff888141b0d600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  >ffff888141b0d680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>                           ^
>>   ffff888141b0d700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>   ffff888141b0d780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> 
>> Reported-by: syzbot+a5964227adc0f904549c@...kaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c
>> Tested-by: syzbot+a5964227adc0f904549c@...kaller.appspotmail.com
>> Fixes: e24bbad29a8d ("bpf: Introduce load-acquire and store-release instructions")
>> Signed-off-by: Kohei Enju <enjuk@...zon.com>
>> ---
>
>I wonder if we have test cases for malformed instructions.
>Maybe add one since this issue was hit?

Thank you for the suggestion. 
I agree that having a test for this specific issue would be valuable. 
I'll work on it.

>
>>  kernel/bpf/verifier.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>> 
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 3303a3605ee8..6481604ab612 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -7788,6 +7788,12 @@ static int check_atomic_rmw(struct bpf_verifier_env *env,
>>  static int check_atomic_load(struct bpf_verifier_env *env,
>>  			     struct bpf_insn *insn)
>>  {
>> +	int err;
>> +
>> +	err = check_reg_arg(env, insn->src_reg, SRC_OP);
>> +	if (err)
>> +		return err;
>> +
>
>I agree with these changes, however, both check_load_mem() and
>check_store_reg() already do check_reg_arg() first thing.
>Maybe just swap the atomic_ptr_type_ok() and check_load_mem()?

You're absolutely right. The additional check_reg_arg() introduces some 
redundancy since check_load_mem() and check_store_reg() do that.
I've revised the patch to simply swap the order and syzbot didn't trigger 
the issue in this context.
    https://lore.kernel.org/all/20250315055941.10487-2-enjuk@amazon.com/

However, the change in order affects results of existing test cases 
because messages are changed by this swapping.[1]
In this case, we need to either modify these existing test cases or 
reconsider the logic of check_atomic_load/store().
I'm not sure which approach would be preferable in this situation.

[1] https://patchwork.kernel.org/project/netdevbpf/patch/20250315055941.10487-2-enjuk@amazon.com/
 First test_progs failure (test_progs-x86_64-llvm-18):
 #504 verifier_load_acquire
 tester_init:PASS:tester_log_buf 0 nsec
 process_subtest:PASS:obj_open_mem 0 nsec
 process_subtest:PASS:specs_alloc 0 nsec
 #504/17 verifier_load_acquire/load-acquire from pkt pointer
 run_subtest:PASS:obj_open_mem 0 nsec
 libbpf: prog 'load_acquire_from_pkt_pointer': BPF program load failed: -EACCES
 libbpf: prog 'load_acquire_from_pkt_pointer': failed to load: -EACCES
 libbpf: failed to load object 'verifier_load_acquire'
 run_subtest:PASS:unexpected_load_success 0 nsec
 validate_msgs:FAIL:754 expect_msg
 VERIFIER LOG:
 =============
 Global function load_acquire_from_pkt_pointer() doesn't return scalar. Only those are supported.
 0: R1=ctx() R10=fp0
 ; asm volatile ( @ verifier_load_acquire.c:140
 0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx() R2_w=pkt(r=0)
 1: (d3) r0 = load_acquire((u8 *)(r2 +0))
 invalid access to packet, off=0 size=1, R2(id=0,off=0,r=0)
 R2 offset is outside of the packet
 processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
 =============
 EXPECTED   SUBSTR: 'BPF_ATOMIC loads from R2 pkt is not allowed'
 ... (snip)
 #541/25 verifier_store_release/store-release to sock pointer
 run_subtest:PASS:obj_open_mem 0 nsec
 libbpf: prog 'store_release_to_sock_pointer': BPF program load failed: -EACCES
 libbpf: prog 'store_release_to_sock_pointer': failed to load: -EACCES
 libbpf: failed to load object 'verifier_store_release'
 run_subtest:PASS:unexpected_load_success 0 nsec
 validate_msgs:FAIL:754 expect_msg
 VERIFIER LOG:
 =============
 Global function store_release_to_sock_pointer() doesn't return scalar. Only those are supported.
 0: R1=ctx() R10=fp0
 ; asm volatile ( @ verifier_store_release.c:191
 0: (b4) w0 = 0                        ; R0_w=0
 1: (79) r2 = *(u64 *)(r1 +40)         ; R1=ctx() R2_w=sock()
 2: (d3) store_release((u8 *)(r2 +0), r0)
 R2 cannot write into sock
 processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
 =============
 EXPECTED   SUBSTR: 'BPF_ATOMIC stores into R2 sock is not allowed'

>
>>  	if (!atomic_ptr_type_ok(env, insn->src_reg, insn)) {
>>  		verbose(env, "BPF_ATOMIC loads from R%d %s is not allowed\n",
>>  			insn->src_reg,
>> @@ -7801,6 +7807,12 @@ static int check_atomic_load(struct bpf_verifier_env *env,
>>  static int check_atomic_store(struct bpf_verifier_env *env,
>>  			      struct bpf_insn *insn)
>>  {
>> +	int err;
>> +
>> +	err = check_reg_arg(env, insn->dst_reg, SRC_OP);
>> +	if (err)
>> +		return err;
>> +
>>  	if (!atomic_ptr_type_ok(env, insn->dst_reg, insn)) {
>>  		verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n",
>>  			insn->dst_reg,

Regards,
Kohei

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ