Message-ID: <tencent_96B2406D35BC34CC8A5C0D6E0F7B13662007@qq.com>
Date: Tue, 4 Mar 2025 07:26:41 -0500
From: "ffhgfv" <744439878@...com>
To: "linux-kernel" <linux-kernel@...r.kernel.org>, "linux-fsdevel" <linux-fsdevel@...r.kernel.org>
Cc: "linkinjeon" <linkinjeon@...nel.org>, "sj1557.seo" <sj1557.seo@...sung.com>, "yuezhang.mo" <yuezhang.mo@...y.com>
Subject: kernel bug found in exfat and suggestions for fixing it
Hello, I found a bug titled "KASAN: vmalloc-out-of-bounds Write in vfree_atomic" with a modified syzkaller on the latest upstream kernel, related to the exfat file system.
If you fix this issue, please add the following tag to the commit: Reported-by: Jianzhou Zhao <xnxc22xnxc22@...com>, xingwei lee <xrivendell7@...il.com>, Zhizhuo Tang <strforexctzzchange@...mail.com>
------------[ cut here ]------------
TITLE: BUG: KASAN: vmalloc-out-of-bounds in llist_add_batch
==================================================================
BUG: KASAN: vmalloc-out-of-bounds in llist_add_batch+0x14f/0x170 lib/llist.c:32
Write of size 8 at addr ffffc90006531000 by task syz.0.183/13735
CPU: 1 UID: 0 PID: 13735 Comm: syz.0.183 Not tainted 6.14.0-rc5-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<irq>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0xc1/0x630 mm/kasan/report.c:521
kasan_report+0xbd/0xf0 mm/kasan/report.c:634
llist_add_batch+0x14f/0x170 lib/llist.c:32
llist_add include/linux/llist.h:248 [inline]
vfree_atomic+0x5e/0xe0 mm/vmalloc.c:3326
vfree+0x7c1/0x940 mm/vmalloc.c:3353
kvfree+0x32/0x50 mm/util.c:703
delayed_free+0x49/0xb0 fs/exfat/super.c:809
rcu_do_batch kernel/rcu/tree.c:2546 [inline]
rcu_core+0x79f/0x14f0 kernel/rcu/tree.c:2802
handle_softirqs+0x1d1/0x870 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0x109/0x170 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa8/0xc0 arch/x86/kernel/apic/apic.c:1049
</irq>
<task>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:lock_acquire.part.0+0x155/0x370 kernel/locking/lockdep.c:5816
Code: b8 ff ff ff ff 65 0f c1 05 30 13 6d 7e 83 f8 01 0f 85 ca 01 00 00 9c 58 f6 c4 02 0f 85 df 01 00 00 48 85 ed 0f 85 b0 01 00 00 <48> b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00 00 00 00 48 c7
RSP: 0018:ffffc9000631f478 EFLAGS: 00000206
RAX: 0000000000000046 RBX: 1ffff92000c63e90 RCX: 1ffff92000c63e77
RDX: 1ffff1100426a15d RSI: 0000000000000002 RDI: 0000000000000000
RBP: 0000000000000200 R08: 0000000000000000 R09: fffffbfff2d943a0
R10: ffffffff96ca1d07 R11: 0000000000000000 R12: 0000000000000002
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8dfbc0e0
rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
rcu_read_lock_sched include/linux/rcupdate.h:941 [inline]
pfn_valid include/linux/mmzone.h:2067 [inline]
pfn_valid include/linux/mmzone.h:2050 [inline]
page_table_check_clear+0x112/0x9b0 mm/page_table_check.c:70
__page_table_check_pte_clear+0xfc/0x110 mm/page_table_check.c:169
page_table_check_pte_clear include/linux/page_table_check.h:49 [inline]
ptep_get_and_clear_full arch/x86/include/asm/pgtable.h:1337 [inline]
get_and_clear_full_ptes include/linux/pgtable.h:712 [inline]
zap_present_folio_ptes mm/memory.c:1511 [inline]
zap_present_ptes mm/memory.c:1596 [inline]
do_zap_pte_range mm/memory.c:1698 [inline]
zap_pte_range mm/memory.c:1742 [inline]
zap_pmd_range mm/memory.c:1834 [inline]
zap_pud_range mm/memory.c:1863 [inline]
zap_p4d_range mm/memory.c:1884 [inline]
unmap_page_range+0x2db5/0x4270 mm/memory.c:1905
unmap_single_vma+0x19a/0x2b0 mm/memory.c:1951
unmap_vmas+0x1f2/0x440 mm/memory.c:1995
exit_mmap+0x1b4/0xbc0 mm/mmap.c:1284
__mmput+0x128/0x400 kernel/fork.c:1356
mmput+0x60/0x70 kernel/fork.c:1378
exit_mm kernel/exit.c:570 [inline]
do_exit+0x9ae/0x2d00 kernel/exit.c:925
do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
get_signal+0x2278/0x2540 kernel/signal.c:3036
arch_do_signal_or_restart+0x81/0x7d0 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xd8/0x250 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f366fbab49e
Code: Unable to access opcode bytes at 0x7f366fbab474.
RSP: 002b:00007f3670accda8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: fffffffffffffff4 RBX: 00000000000014d8 RCX: 00007f366fbab49e
RDX: 0000000020001500 RSI: 0000000020001540 RDI: 00007f3670acce00
RBP: 00007f3670acce40 R08: 00007f3670acce40 R09: 0000000000000000
R10: 0000000000010400 R11: 0000000000000246 R12: 0000000020001500
R13: 0000000020001540 R14: 00007f3670acce00 R15: 0000000020000040
</task>
The buggy address ffffc90006531000 belongs to a vmalloc virtual mapping
Memory state around the buggy address:
ffffc90006530f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffffc90006530f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>ffffc90006531000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
^
ffffc90006531080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffffc90006531100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
----------------
Code disassembly (best guess):
0: b8 ff ff ff ff mov $0xffffffff,%eax
5: 65 0f c1 05 30 13 6d xadd %eax,%gs:0x7e6d1330(%rip) # 0x7e6d133d
c: 7e
d: 83 f8 01 cmp $0x1,%eax
10: 0f 85 ca 01 00 00 jne 0x1e0
16: 9c pushf
17: 58 pop %rax
18: f6 c4 02 test $0x2,%ah
1b: 0f 85 df 01 00 00 jne 0x200
21: 48 85 ed test %rbp,%rbp
24: 0f 85 b0 01 00 00 jne 0x1da
* 2a: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax <-- trapping instruction
31: fc ff df
34: 48 01 c3 add %rax,%rbx
37: 48 c7 03 00 00 00 00 movq $0x0,(%rbx)
3e: 48 rex.W
3f: c7 .byte 0xc7
==================================================================
I used the same kernel as the upstream syzbot instance: 7eb172143d5508b4da468ed59ee857c6e5e01da6
kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=da4b04ae798b7ef6
compiler: gcc version 11.4.0
===============================================================================
Unfortunately, the modified syzkaller did not generate a working reproducer.
The following is my analysis of the bug, together with suggestions for a fix, in the hope that it helps:
Root cause analysis
Trigger path:
The exfat file system frees memory by calling kvfree() from its RCU callback delayed_free().
kvfree() calls vfree(), which in interrupt context defers the work to vfree_atomic(); that function adds the memory block to a lock-less list with llist_add().
In llist_add_batch(), KASAN reported an out-of-bounds write when the next pointer was stored at address ffffc90006531000. vfree_atomic() uses the address being freed as the list node, so this store lands at the start of the buffer passed to kvfree().
Possible root causes:
Use-after-free:
Memory may be added to the list after it has already been freed. For example, struct exfat_sb_info is released in the RCU callback (delayed_free), while the list operation still references memory that belongs to it.
Wrong list node address:
The node pointer passed to llist_add() may not point to a valid vmalloc memory region, or the pointer may have been miscalculated.
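The trigger path above can be modeled in user space. The sketch below is not kernel code: model_llist_add(), model_vfree_atomic() and model_demo() are hypothetical names, and malloc() stands in for vmalloc(). It only illustrates that a deferred free stores the list's next pointer at the start of the buffer being freed, which is the write that KASAN flagged.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space model of a lock-less list; not the kernel implementation. */
struct model_llist_node {
    struct model_llist_node *next;
};

struct model_llist_head {
    _Atomic(struct model_llist_node *) first;
};

/* Push one node; returns true if the list was empty before the push. */
static bool model_llist_add(struct model_llist_node *node,
                            struct model_llist_head *head)
{
    struct model_llist_node *first = atomic_load(&head->first);

    do {
        /* This store writes into the start of the pushed buffer. */
        node->next = first;
    } while (!atomic_compare_exchange_weak(&head->first, &first, node));

    return first == NULL;
}

/* Deferred free: the buffer itself is reused as the list node. */
static bool model_vfree_atomic(void *addr, struct model_llist_head *head)
{
    return addr && model_llist_add((struct model_llist_node *)addr, head);
}

/* Queue two heap buffers and check what was written into them. */
static int model_demo(void)
{
    struct model_llist_head head = { NULL };
    void *a = malloc(64);
    void *b = malloc(64);

    assert(a && b);
    assert(model_vfree_atomic(a, &head));   /* list was empty */
    assert(!model_vfree_atomic(b, &head));  /* list already held a */

    /* The start of b now holds the address of a; a terminates the list. */
    assert(((struct model_llist_node *)b)->next == a);
    assert(((struct model_llist_node *)a)->next == NULL);

    free(a);
    free(b);
    return 0;
}
```

In this model the store to node->next is harmless because both buffers are still allocated when they are queued. If a buffer were queued after its mapping had already been released, the same store would touch memory that is no longer valid.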
Repair suggestions
1. Ensure that the linked list operation is performed while the memory is still live
Problem: llist_add() may operate on a node after its memory has been freed.
Fix: before freeing memory, make sure it is no longer referenced by the linked list.
Modify delayed_free() or other related functions so that the list node is removed before the memory is freed in the RCU callback.
// fs/exfat/super.c
 static void delayed_free(struct rcu_head *p)
 {
 	struct exfat_sb_info *sbi = container_of(p, struct exfat_sb_info, rcu);
+	/*
+	 * Illustrative only: llist_del() and sbi->list_node are placeholders.
+	 * The kernel's llist API has no llist_del(); it removes entries with
+	 * helpers such as llist_del_first() and llist_del_all().
+	 */
+	llist_del(&sbi->list_node);
 	kvfree(sbi);
 }
2. Check the validity of the node pointer before it is added to the list
Problem: the node pointer passed to llist_add() may point to an invalid address.
Fix: verify that the pointer lies in the vmalloc region before calling llist_add().
// mm/vmalloc.c
 void vfree_atomic(const void *addr)
 {
 	struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);
 	BUG_ON(in_nmi());
 	kmemleak_free(addr);
 	/*
 	 * Use raw_cpu_ptr() because this can be called from preemptible
 	 * context. Preemption is absolutely fine here, because the llist_add()
 	 * implementation is lockless, so it works even if we are adding to
 	 * another cpu's list. schedule_work() should be fine with this too.
 	 */
+	/* Reject non-NULL addresses outside the vmalloc area; NULL stays a no-op. */
+	if (addr && !is_vmalloc_addr(addr)) {
+		WARN_ONCE(1, "vfree_atomic: invalid address %p\n", addr);
+		return;
+	}
 	if (addr && llist_add((struct llist_node *)addr, &p->list))
 		schedule_work(&p->wq);
 }
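To see what a range check of this kind would and would not catch, here is a small user-space model. It is only a sketch under stated assumptions: MODEL_VMALLOC_START and MODEL_VMALLOC_END are the documented x86-64 4-level bounds without KASLR (the real kernel uses VMALLOC_START and VMALLOC_END, which can differ), and model_is_vmalloc_addr() and model_guard() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds: x86-64, 4-level paging, no KASLR. Not read from a live kernel. */
#define MODEL_VMALLOC_START UINT64_C(0xffffc90000000000)
#define MODEL_VMALLOC_END   UINT64_C(0xffffe8ffffffffff)

/* Same shape as a plain range test on the address. */
static bool model_is_vmalloc_addr(uint64_t addr)
{
    return addr >= MODEL_VMALLOC_START && addr < MODEL_VMALLOC_END;
}

enum model_guard_result {
    MODEL_GUARD_SKIP_NULL,  /* NULL: freeing nothing is a valid no-op */
    MODEL_GUARD_REJECT,     /* non-NULL but outside the vmalloc range */
    MODEL_GUARD_QUEUE       /* inside the range: would be queued */
};

/* Decision taken by the guarded deferred free for a given address. */
static enum model_guard_result model_guard(uint64_t addr)
{
    if (addr == 0)
        return MODEL_GUARD_SKIP_NULL;
    if (!model_is_vmalloc_addr(addr))
        return MODEL_GUARD_REJECT;
    return MODEL_GUARD_QUEUE;
}
```

Under these assumptions the reported address ffffc90006531000 lies inside the range, which matches the KASAN line stating that it belongs to a vmalloc virtual mapping. A range check alone would therefore still queue it; it guards against pointers from other regions, not against a stale vmalloc pointer.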
=========================================================================
I hope it helps.
Best regards
Jianzhou Zhao
xingwei lee
Zhizhuo Tang