Message-ID: <271c8570.69a28.19c143721c6.Coremail.3230100410@zju.edu.cn>
Date: Sat, 31 Jan 2026 21:21:23 +0800 (GMT+08:00)
From: 余昊铖 <3230100410@....edu.cn>
To: security@...nel.org, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap
Hello,
I would like to report a reference counting vulnerability in the Linux kernel perf_event subsystem, which I discovered using a modified syzkaller-based kernel fuzzing tool that I developed.
Summary
-------
A local user can trigger a reference count saturation or a use-after-free (UAF) vulnerability in the perf_mmap function. This is caused by a race condition where a ring_buffer object's reference count is incremented after it has already reached zero.
The vulnerability exists in the perf_mmap() function in kernel/events/core.c. While the function uses mmap_mutex to protect the initial buffer setup, it performs subsequent operations (such as map_range()) on event->rb outside of the locked scope. If the event is closed or the buffer is detached concurrently, the reference count of the ring_buffer can drop to zero, leading to an 'addition on 0' warning or a UAF when the kernel later attempts to access or increment it.
I verified this on Linux kernel version 6.18.5.
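To illustrate why an increment after the count has reached zero is reported, here is a minimal single-threaded user-space sketch. It is not kernel code: struct model_refcount and the model_* and demo_* functions are hypothetical names invented for this illustration. The real refcount_t is atomic and, as far as I understand it, also saturates the counter when it warns, which this model does not reproduce.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for refcount_t (not atomic). */
struct model_refcount {
	unsigned int count;
	bool warned;	/* set when an increment from zero is attempted */
};

/* Unconditional get: an increment from zero means the caller held a
 * stale pointer to an object whose last reference was already dropped. */
static void model_refcount_inc(struct model_refcount *r)
{
	if (r->count == 0) {
		r->warned = true;
		fprintf(stderr, "model: addition on 0; use-after-free.\n");
		return;
	}
	r->count++;
}

/* Conditional get: refuses to resurrect an object that is already dead. */
static bool model_refcount_inc_not_zero(struct model_refcount *r)
{
	if (r->count == 0)
		return false;
	r->count++;
	return true;
}

/* Put: returns true when the caller dropped the last reference. */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	if (r->count == 0) {
		r->warned = true;
		return false;
	}
	return --r->count == 0;
}

/* Replays the problematic order: the last put first, then a late get. */
static bool demo_late_inc_is_flagged(void)
{
	struct model_refcount rb = { .count = 1, .warned = false };

	model_refcount_dec_and_test(&rb);	/* teardown drops the last ref */
	model_refcount_inc(&rb);		/* late mmap path takes a ref */
	return rb.warned;
}

/* Same order, but the late get is conditional and fails cleanly. */
static bool demo_inc_not_zero_refuses(void)
{
	struct model_refcount rb = { .count = 1, .warned = false };

	model_refcount_dec_and_test(&rb);
	return !model_refcount_inc_not_zero(&rb) && !rb.warned;
}
```

In this model, demo_late_inc_is_flagged() returns true because the increment arrives after the final put, while demo_inc_not_zero_refuses() shows the conditional variant rejecting the same late get without a warning.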
Environment
-----------
- Kernel version: 6.18.5 (the complete config is attached)
- Architecture: x86_64
- Hypervisor: QEMU (Standard PC i440FX + PIIX, BIOS 1.13.0-1ubuntu1.1)
Symptoms and logs
-----------------
The kernel emits a 'refcount_t: addition on 0; use-after-free' warning, followed by a memory leak warning.
The full report is below:
audit: type=1400 audit(1769676568.351:202): avc: denied { open } for pid=21484 comm="syz.6.2386" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=perf_event permissive=1
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 3 PID: 21486 at lib/refcount.c:25 refcount_warn_saturate+0x13c/0x1b0 lib/refcount.c:25
Modules linked in:
CPU: 3 UID: 0 PID: 21486 Comm: syz.6.2386 Not tainted 6.18.5 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
RIP: 0010:refcount_warn_saturate+0x13c/0x1b0 lib/refcount.c:25
Code: f0 40 ff 80 3d 70 44 61 03 00 0f 85 52 ff ff ff e8 c9 f0 40 ff c6 05 5e 44 61 03 01 90 48 c7 c7 80 43 5c b8 e8 75 5d 0f ff 90 <0f> 0b 90 90 e9 2f ff ff ff e8 a6 f0 40 ff 80 3d 3d 44 61 03 00 0f
RSP: 0018:ffff888103c17678 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888107641190 RCX: ffffffff8137110c
RDX: 0000000000080000 RSI: ffffc90002279000 RDI: ffff88811b3a3e88
RBP: 0000000000000002 R08: fffffbfff7219644 R09: ffffed10236747d2
R10: ffffed10236747d1 R11: ffff88811b3a3e8b R12: 0000000000000000
R13: ffff888002255a10 R14: ffff888002255a00 R15: ffff888107641170
FS: 00007f7ef46f3640(0000) GS:ffff888160a03000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000478 CR3: 000000010698c006 CR4: 0000000000770ff0
DR0: 0000000000000000 DR1: 00000200000000a2 DR2: 00000200000000a2
DR3: 00000200000000a2 DR6: 00000000ffff0ff0 DR7: 0000000000000600
PKRU: 80000000
Call Trace:
<TASK>
__refcount_add include/linux/refcount.h:289 [inline]
__refcount_inc include/linux/refcount.h:366 [inline]
refcount_inc include/linux/refcount.h:383 [inline]
perf_mmap_rb kernel/events/core.c:7005 [inline]
perf_mmap+0x126d/0x1990 kernel/events/core.c:7163
vfs_mmap include/linux/fs.h:2405 [inline]
mmap_file mm/internal.h:167 [inline]
__mmap_new_file_vma mm/vma.c:2413 [inline]
__mmap_new_vma mm/vma.c:2476 [inline]
__mmap_region+0xea5/0x2250 mm/vma.c:2670
mmap_region+0x267/0x350 mm/vma.c:2740
do_mmap+0x769/0xe50 mm/mmap.c:558
vm_mmap_pgoff+0x1e1/0x330 mm/util.c:581
ksys_mmap_pgoff+0x35d/0x4b0 mm/mmap.c:604
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:89 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:82 [inline]
__x64_sys_mmap+0x116/0x180 arch/x86/kernel/sys_x86_64.c:82
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xac/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7ef5cabb9d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7ef46f2fc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 00007f7ef5f01fa0 RCX: 00007f7ef5cabb9d
RDX: 0000000001000003 RSI: 0000000000002000 RDI: 0000200000ffa000
RBP: 00007f7ef5d2f00a R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000013 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f7ef5f01fac R14: 00007f7ef5f02038 R15: 00007f7ef46f3640
</TASK>
---[ end trace 0000000000000000 ]---
EXT4-fs error (device loop0): ext4_mb_generate_buddy:1303: group 0, block bitmap and bg descriptor inconsistent: 219 vs 12386523 free clusters
EXT4-fs (loop0): Delayed block allocation failed for inode 15 at logical offset 1 with max blocks 2048 with error 28
EXT4-fs (loop0): This should not happen!! Data will be lost
EXT4-fs (loop0): Total free blocks count 0
EXT4-fs (loop0): Free/Dirty block details
EXT4-fs (loop0): free_blocks=12386304
EXT4-fs (loop0): dirty_blocks=16387
EXT4-fs (loop0): Block reservation details
EXT4-fs (loop0): i_reserved_data_blocks=16387
EXT4-fs (loop0): Delayed block allocation failed for inode 15 at logical offset 2052 with max blocks 2048 with error 28
EXT4-fs (loop0): This should not happen!! Data will be lost
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
SYZFAIL: failed to recv rpc
fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
Reproducer
----------
The issue can be reproduced with the attached C reproducer, which triggers the vulnerability by creating a high-frequency race between memory mapping and event teardown.
The reproducer follows this execution flow:
1. Event Creation: It initializes a performance monitoring event via perf_event_open(), typically with inherit or specific sample_type flags that necessitate the allocation of a kernel ring_buffer.
2. Multithreaded Hammering: The program spawns multiple threads or forks child processes to perform concurrent operations on the same file descriptor.
3. The Race: Thread A continuously calls mmap() on the perf file descriptor. This enters the kernel-side perf_mmap() function, which briefly acquires the mmap_mutex to set up the buffer but then drops it. Meanwhile, Thread B (or the main loop) attempts to close the descriptor or modify the event state, which can trigger the destruction or detachment of the ring_buffer.
4. Vulnerability Trigger: Because perf_mmap() accesses event->rb to perform map_range() after the mmap_mutex has been released, Thread B can drop the buffer's reference count to zero during this unprotected window.
5. Crash/Warning: When Thread A finally reaches the code that increments the reference count or accesses the buffer (e.g., in perf_mmap_rb or map_range), the refcount_t infrastructure detects an "addition on 0," resulting in the KASAN or refcount_warn_saturate report.
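The interleaving in steps 3 to 5 can be replayed deterministically in user space. The sketch below is not the attached repro.c and not kernel code. struct model_event, struct model_rb and the model_* and demo_* functions are hypothetical names, the teardown is injected through a callback instead of a second thread, and it assumes the teardown path takes the same mutex. It only contrasts "set up under the lock, use after unlock" with "set up and use under one lock".

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_rb {
	int refs;			/* 0 means the buffer is dead */
};

struct model_event {
	pthread_mutex_t mmap_mutex;
	struct model_rb *rb;		/* NULL once detached */
};

typedef void (*race_hook)(struct model_event *);

/* Teardown path: detach the buffer under the mutex and drop its ref. */
static void model_detach(struct model_event *ev)
{
	pthread_mutex_lock(&ev->mmap_mutex);
	if (ev->rb) {
		ev->rb->refs--;
		ev->rb = NULL;
	}
	pthread_mutex_unlock(&ev->mmap_mutex);
}

/* Split pattern: setup under the lock, "map" step after the unlock.
 * Returns true only if the buffer was still live at the map step. */
static bool model_mmap_split(struct model_event *ev, race_hook window)
{
	struct model_rb *rb;

	pthread_mutex_lock(&ev->mmap_mutex);
	rb = ev->rb;			/* setup */
	pthread_mutex_unlock(&ev->mmap_mutex);

	if (window)
		window(ev);		/* concurrent teardown lands here */

	return rb && rb->refs > 0;	/* stale pointer may see a dead buffer */
}

/* Single-lock pattern: teardown cannot run until the map step is done. */
static bool model_mmap_locked(struct model_event *ev, race_hook after)
{
	bool live;

	pthread_mutex_lock(&ev->mmap_mutex);
	live = ev->rb && ev->rb->refs > 0;	/* setup and map step together */
	pthread_mutex_unlock(&ev->mmap_mutex);

	if (after)
		after(ev);		/* teardown is serialized behind us */

	return live;
}

/* Split pattern with teardown in the window: the map step sees a dead buffer. */
static bool demo_split_sees_dead_buffer(void)
{
	struct model_rb rb = { .refs = 1 };
	struct model_event ev = { .mmap_mutex = PTHREAD_MUTEX_INITIALIZER, .rb = &rb };

	return !model_mmap_split(&ev, model_detach);
}

/* Single-lock pattern: the map step completes, then teardown proceeds. */
static bool demo_locked_sees_live_buffer(void)
{
	struct model_rb rb = { .refs = 1 };
	struct model_event ev = { .mmap_mutex = PTHREAD_MUTEX_INITIALIZER, .rb = &rb };

	return model_mmap_locked(&ev, model_detach) && ev.rb == NULL && rb.refs == 0;
}
```

Both demo functions return true in this model. In the model the buffer lives on the stack for the whole call, so the stale pointer is safe to read here; in the kernel the corresponding access would touch freed memory.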
Security impact
---------------
The vulnerability allows a local user to trigger a reference count saturation or a use-after-free (UAF) condition. The immediate symptom is typically a kernel warning, or a denial of service through a system hang or panic, especially in environments with panic_on_warn enabled. The underlying memory corruption is the more significant threat: if a ring_buffer object is accessed after its reference count has reached zero, an attacker may be able to use heap grooming to have the freed memory reallocated with a controlled structure, which could potentially be exploited to achieve local privilege escalation. This makes it a critical issue for multi-user systems or containerized environments where the perf_event interface is accessible.
Patch
-----
From 34545a4d43adef3147e0ba1c744deb128a05a101 Mon Sep 17 00:00:00 2001
From: 0ne1r0s <yuhaocheng035@...il.com>
Date: Sat, 31 Jan 2026 21:16:52 +0800
Subject: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap
The issue is caused by a race condition between mmap() and event
teardown. In perf_mmap(), the ring_buffer (rb) is accessed via
map_range() after the mmap_mutex is released. If another thread
closes the event or detaches the buffer during this window, the
reference count of rb can drop to zero, leading to a UAF or
refcount saturation when map_range() or subsequent logic attempts
to use it.
Fix this by extending the scope of mmap_mutex to cover the entire
setup process, including map_range(), ensuring the buffer remains
valid until the mapping is complete.
Signed-off-by: 0ne1r0s <yuhaocheng035@...il.com>
---
kernel/events/core.c | 42 +++++++++++++++++++++---------------------
1 file changed, 21 insertions(+), 21 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2c35acc2722b..7c93f7d057cb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 			ret = perf_mmap_aux(vma, event, nr_pages);
 		if (ret)
 			return ret;
-	}
-
-	/*
-	 * Since pinned accounting is per vm we cannot allow fork() to copy our
-	 * vma.
-	 */
-	vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP);
-	vma->vm_ops = &perf_mmap_vmops;
-	mapped = get_mapped(event, event_mapped);
-	if (mapped)
-		mapped(event, vma->vm_mm);
-
-	/*
-	 * Try to map it into the page table. On fail, invoke
-	 * perf_mmap_close() to undo the above, as the callsite expects
-	 * full cleanup in this case and therefore does not invoke
-	 * vmops::close().
-	 */
-	ret = map_range(event->rb, vma);
-	if (ret)
-		perf_mmap_close(vma);
+		/*
+		 * Since pinned accounting is per vm we cannot allow fork() to copy our
+		 * vma.
+		 */
+		vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP);
+		vma->vm_ops = &perf_mmap_vmops;
+
+		mapped = get_mapped(event, event_mapped);
+		if (mapped)
+			mapped(event, vma->vm_mm);
+
+		/*
+		 * Try to map it into the page table. On fail, invoke
+		 * perf_mmap_close() to undo the above, as the callsite expects
+		 * full cleanup in this case and therefore does not invoke
+		 * vmops::close().
+		 */
+		ret = map_range(event->rb, vma);
+		if (ret)
+			perf_mmap_close(vma);
+	}
 	return ret;
 }
--
2.51.0
Request
-------
Could you please review this issue and the proposed fix? If this is a confirmed new vulnerability, I would appreciate coordination on a CVE ID.
Best regards,
Haocheng Yu
Zhejiang University
View attachment "repro.c" of type "text/plain" (24256 bytes)
Download attachment ".config" of type "application/octet-stream" (146310 bytes)
Download attachment "perf_mmap.patch" of type "application/octet-stream" (2539 bytes)