Message-ID: <CAKHoSAvi5LZENKr4dKqwbpQ3dXrDS_3=O-d43BpptZTVipehrQ@mail.gmail.com>
Date: Fri, 3 Jan 2025 16:28:30 +0800
From: cheung wall <zzqq0103.hey@...il.com>
To: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Borislav Petkov <bp@...en8.de>, 
	"Liang, Kan" <kan.liang@...ux.intel.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc: Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, 
	"H. Peter Anvin" <hpa@...or.com>, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: "INFO: rcu detected stall in corrupted" in Linux kernel version 6.13.0-rc2

Hello,

I am writing to report a potential vulnerability identified in the
Linux Kernel version 6.13.0-rc2. This issue was discovered using our
custom vulnerability discovery tool.

HEAD commit: fac04efc5c793dccbd07e2d59af9f90b7fc0dca4 (tag: v6.13-rc2)

Affected File: arch/x86/events/core.c

Function: x86_pmu_enable_event

Kernel log (the NMI call trace itself is empty):

------------[ cut here begin]------------

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-...D } 10 jiffies s: 121 root: 0x8/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 0 to CPUs 3:
loop4: detected capacity change from 0 to 1024
EXT4-fs: Ignoring removed oldalloc option
NMI backtrace for cpu 3
CPU: 3 UID: 0 PID: 4204 Comm: syz-executor.6 Not tainted 6.13.0-rc2-00159-gf932fb9b4074 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:arch_static_branch arch/x86/include/asm/jump_label.h:36 [inline]
RIP: 0010:native_write_msr arch/x86/include/asm/msr.h:149 [inline]
RIP: 0010:wrmsrl arch/x86/include/asm/msr.h:264 [inline]
RIP: 0010:__x86_pmu_enable_event arch/x86/events/perf_event.h:1219 [inline]
RIP: 0010:x86_pmu_enable_event+0x126/0x2a0 arch/x86/events/core.c:1430
Code: 81 cd 00 00 40 00 48 c1 ea 03 4c 21 e5 80 3c 02 00 0f 85 64 01 00 00 8b 9b 78 01 00 00 48 89 ea 89 e8 48 c1 ea 20 89 d9 0f 30 <66> 90 5b 5d 41 5c 41 5d 41 5e 41 5f e9 a9 c2 45 00 e8 a4 c2 45 00
RSP: 0018:ffff8881058079d8 EFLAGS: 00000056
RAX: 0000000000530076 RBX: 00000000c0010200 RCX: 00000000c0010200
RDX: 0000000000000000 RSI: ffffc90008bc0000 RDI: ffff888106bb2d10
RBP: 0000000000530076 R08: 0000000000000000 R09: ffffed1022ab4484
R10: ffff8881155a2427 R11: 0000000000032001 R12: fffffdffffffffff
R13: 0000000000000000 R14: dffffc0000000000 R15: 0000000000000000
FS: 00007f44665b6640(0000) GS:ffff888115580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2dd24000 CR3: 0000000107544000 CR4: 0000000000350ef0
Call Trace:
<NMI>
</NMI>
WARNING: stack recursion on stack type 6
perf: interrupt took too long (3192 > 3161), lowering kernel.perf_event_max_sample_rate to 62000
lo: entered promiscuous mode
lo: entered allmulticast mode
netlink: 'syz-executor.2': attribute type 12 has an invalid length.
EXT4-fs (loop4): mounted filesystem 00000000-0000-0000-0000-000000000000 r/w without journal. Quota mode: none.
perf: interrupt took too long (4025 > 3990), lowering kernel.perf_event_max_sample_rate to 49000
netlink: 3 bytes leftover after parsing attributes in process `syz-executor.6'.
perf: interrupt took too long (5057 > 5031), lowering kernel.perf_event_max_sample_rate to 39000
perf: interrupt took too long (6333 > 6321), lowering kernel.perf_event_max_sample_rate to 31000
ICMPv6: process `sh' is using deprecated sysctl (syscall) net.ipv6.neigh.lo.base_reachable_time - use net.ipv6.neigh.lo.base_reachable_time_ms instead
EXT4-fs (loop4): unmounting filesystem 00000000-0000-0000-0000-000000000000.
sg_write: data in/out 2031580/78 bytes for SCSI command 0x0-- guessing data in;
program syz-executor.3 not setting count and/or reply_len properly
loop1: detected capacity change from 0 to 8192
loop1: p1 p2 p3
netlink: 'syz-executor.1': attribute type 1 has an invalid length.
loop1: p1 p2 p3
cgroup: Need name or subsystem set
loop0: detected capacity change from 0 to 1164
loop7: detected capacity change from 0 to 262144
loop1: detected capacity change from 0 to 1024
EXT4-fs (loop1): Invalid log block size: 48537


------------[ cut here end]------------
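
For reference, the register dump in the log can be read as the operands
of the wrmsr instruction (opcode 0f 30) that sits just before the
reported RIP: the MSR index is in ECX and the value in EDX:EAX. The
sketch below decodes that value. It is only an illustration; the helper
name decode_eventsel is hypothetical, and the bit positions are assumed
to follow the kernel's ARCH_PERFMON_EVENTSEL_* definitions.

```python
# Flag bits of an x86 event-select register (assumed to match the
# kernel's ARCH_PERFMON_EVENTSEL_* constants).
EVENTSEL_FLAGS = {
    "USR": 1 << 16,          # count in user mode
    "OS": 1 << 17,           # count in kernel mode
    "EDGE": 1 << 18,
    "PIN_CONTROL": 1 << 19,
    "INT": 1 << 20,          # raise an interrupt on overflow
    "ANY": 1 << 21,
    "ENABLE": 1 << 22,       # counter enabled
    "INV": 1 << 23,
}


def decode_eventsel(value):
    """Split an event-select value into its basic fields (hypothetical helper)."""
    return {
        "event": value & 0xFF,
        "umask": (value >> 8) & 0xFF,
        "cmask": (value >> 24) & 0xFF,
        "flags": [name for name, bit in EVENTSEL_FLAGS.items() if value & bit],
    }


# Values copied from the register dump: RCX, RAX and RDX on CPU 3.
rcx, rax, rdx = 0xC0010200, 0x0000000000530076, 0x0
msr_value = ((rdx & 0xFFFFFFFF) << 32) | (rax & 0xFFFFFFFF)
decoded = decode_eventsel(msr_value)
print(hex(rcx), decoded)
```

Under these assumptions the write targets MSR 0xc0010200 (an AMD
performance event-select register, MSR_F15H_PERF_CTL in the kernel
headers) with event 0x76, unit mask 0, counting in both user and kernel
mode, with the overflow interrupt and the counter enabled. Event 0x76
is the code the kernel's AMD event map uses for CPU cycles.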

Root Cause:

The stall appears to originate in the x86 perf (PMU) code. When
rcu_preempt reported an expedited stall on CPU 3, the NMI backtrace
showed that CPU in x86_pmu_enable_event (arch/x86/events/core.c:1430),
immediately after the wrmsr that programs a performance counter. The
repeated "perf: interrupt took too long" warnings show that PMU
interrupt handling was slow enough for the kernel to lower
kernel.perf_event_max_sample_rate four times, down to 31000, which
suggests that much of the CPU's time was going to perf interrupt
handling. The log also contains a "stack recursion on stack type 6"
warning and an empty NMI call trace, so the full call chain is not
available. Note that the excerpt shows a stall on CPU 3 only and does
not show a kernel panic, and it does not by itself identify a specific
defect in x86_pmu_enable_event. The EXT4 and netlink messages appear
to come from unrelated fuzzer activity running concurrently; nothing
in the log ties them to the PMU path.
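
The numbers in the "perf: interrupt took too long" lines can be
cross-checked with a little arithmetic. The sketch below mirrors, as
far as I understand it, the throttling step in the kernel's
perf_sample_event_took(): the allowed interrupt time becomes the
measured average plus 25%, and the sample rate is derived from a
per-tick CPU budget. HZ=1000 and the default
kernel.perf_cpu_time_max_percent of 25 are assumptions about this
build, and the function name throttle is hypothetical.

```python
HZ = 1000                      # assumed tick rate of the test kernel
TICK_NSEC = 1_000_000_000 // HZ
CPU_TIME_MAX_PERCENT = 25      # assumed kernel.perf_cpu_time_max_percent


def throttle(avg_len_ns):
    """Return (new allowed ns per interrupt, new max sample rate)."""
    # New threshold: the measured average plus a quarter of it.
    allowed_ns = avg_len_ns + avg_len_ns // 4
    # Time budget for perf interrupts within one tick.
    budget_ns = (TICK_NSEC // 100) * CPU_TIME_MAX_PERCENT
    samples_per_tick = budget_ns // allowed_ns if allowed_ns < budget_ns else 1
    return allowed_ns, samples_per_tick * HZ


# Average interrupt durations (ns) reported in the log.
observed = [3192, 4025, 5057, 6333]
results = [throttle(avg) for avg in observed]
for avg, (allowed, rate) in zip(observed, results):
    print(f"avg={avg} ns -> allowed={allowed} ns, max_sample_rate={rate}")
```

With these assumptions the computed rates are 62000, 49000, 39000 and
31000, matching the log, and each computed threshold (3990, 5031, 6321)
is the limit quoted in the following warning. In other words, the
average interrupt time kept creeping just past each new limit rather
than jumping suddenly.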

Thank you for your time and attention.

Best regards

Wall
