linux-kernel - Re: RIP: kernel/cgroup/rstat.c:231 cgroup_rstat

Message-ID: <efddfef4-863a-4f5d-a093-eda41168b462@amd.com>
Date: Tue, 22 Apr 2025 09:17:48 +0530
From: "Aithal, Srikanth" <sraithal@....com>
To: "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
 open list <linux-kernel@...r.kernel.org>
Subject: Re: RIP: kernel/cgroup/rstat.c:231 cgroup_rstat_flush+0x4ea/0x7e0

Hello,

I have hit this issue again while running perf regression CI in our 
environment.

This time, the issue was hit on 
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=sched/core&id=c70fc32f44431bb30f9025ce753ba8be25acbba3.

The recreation steps remain the same.

However, manually recreating this issue seems to be difficult. If anyone 
has any hints or tips for recreation, please let me know.

Thanks,

Srikanth Aithal <Srikanth.Aithal@....com>
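
Mail wrapping can fuse several dmesg entries onto one line and split others across lines, which makes console output pasted into a message hard to scan. The following is a minimal Python sketch, not part of the original report, that re-splits such text into one entry per line. The function name split_entries is hypothetical. It assumes every entry starts with a "[seconds.microseconds]" timestamp and that quoted lines begin with ">". Text before the first timestamp (for example the systemd "[  OK  ]" lines) is dropped, and indentation inside a wrapped entry is not restored.

```python
import re

# A dmesg timestamp such as "[30880.154233]" (seconds.microseconds since boot).
TIMESTAMP = re.compile(r"\[\s*\d+\.\d{6}\]")


def split_entries(text):
    """Return one string per dmesg entry from mail-wrapped console output."""
    # Strip the mail quote marker from each line, then join all lines into
    # a single stream so wrapped continuations rejoin their entry.
    lines = [re.sub(r"^>\s?", "", line).strip() for line in text.splitlines()]
    stream = " ".join(line for line in lines if line)

    # Every timestamp marks the start of a new entry.
    starts = [m.start() for m in TIMESTAMP.finditer(stream)]
    if not starts:
        return []
    bounds = starts + [len(stream)]
    return [stream[a:b].strip() for a, b in zip(bounds, bounds[1:])]
```

Feeding the saved message body through split_entries() and printing each returned item gives a log that can be searched for the "WARNING:", "BUG:" and "hard LOCKUP" entries.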

On 4/9/2025 11:17 AM, Aithal, Srikanth wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> Hello,
> 
> The earlier email bounced for some reason, so I am resending it. Sorry for any inconvenience.
> 
> While performing a kexec on a kernel built from
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=sched/core&id=3e816361e94a0e79b1aabf44abec552e9698b196,
> a warning was hit, followed by an oops and hard lockups.
> 
> Recreation steps where the issue was observed:
> 
> 1. Built and booted
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=sched/core&id=3e816361e94a0e79b1aabf44abec552e9698b196
> with attached kernel config.
> 2. Installed LKP [1] and ran autonuma benchmark via LKP. Below is the job file:
> 
> #! jobs/autonuma-benchmark.yaml
> suite: autonuma-benchmark
> testcase: autonuma-benchmark
> category: benchmark
> autonuma-benchmark:
>     test: numa02SMT_numa01THREAD_ALLOC
>     iterations: 1x
> job_origin: "jobs/autonuma-benchmark.yaml"
> arch: x86_64
> 
> 3. The autonuma job completed successfully. Afterward, the test kexec'd into the same kernel (sched/core[3e816361e9]). During the kexec, the below oops and hard lockups were observed. I have attached the full log.
> 
> 
> [  OK  ] Stopped Remount Root and Kernel File Systems.
> [  OK  ] Stopped Monitoring of LVM2… dmeventd or progress polling.
> [  OK  ] Reached target System Shutdown.
> [  OK  ] Reached target Late Shutdown Services.
>            Starting Reboot via kexec...
> [30880.154233] ------------[ cut here ]------------
> [30880.159530] WARNING: CPU: 60 PID: 1740 at kernel/cgroup/rstat.c:231 cgroup_rstat_flush+0x4ea/0x7e0
> [30880.169808] Modules linked in: xt_CHECKSUM ipt_REJECT ipmi_ssif
> nls_iso8859_1 intel_rapl_msr wmi_bmof amd_atl intel_rapl_common amd64_edac edac_mce_amd kvm_amd rapl joydev input_leds ccp k10temp wmi acpi_ipmi ipmi_si ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr efi_pstore drm autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 hid_generic usbhid ahci ghash_clmulni_intel libahci hid tg3 i2c_piix4 i2c_smbus aesni_intel crypto_simd cryptd [last unloaded: ipmi_devintf] [30880.227826] CPU: 60 UID: 0 PID: 1740 Comm: kworker/60:1 Not tainted
> 6.14.0-3e816361e94a-3e816361e9 #1 PREEMPT(voluntary) [30880.242015] Hardware name: AMD Corporation VOLCANO/VOLCANO, BIOS RVOT1003E 12/11/2024 [30880.250988] Workqueue: cgroup_destroy css_release_work_fn [30880.257172] RIP: 0010:cgroup_rstat_flush+0x4ea/0x7e0
> [30880.262848] Code: 0f 85 e1 fd ff ff 66 90 48 c7 c7 60 ad a7 aa e8 0c 0b f9 00 e8 f7 92 f8 00 85 c0 75 02 f3 90 8b 55 c8 83 c2 01 e9 46 fb ff ff <0f> 0b e9 8e fc ff ff 65 8b 05 0c 40 48 02 48 0f a3 05 b0 d6 f5 01 [30880.284380] RSP: 0018:ff581bf4d4057da0 EFLAGS: 00010046 [30880.290362] RAX: ff8a1bf4b547ed20 RBX: ff8a1bf4b3443fe0 RCX:
> ff3ea4ba8bea4000
> [30880.298536] RDX: ff3ea4ba8bea4000 RSI: 0000000000000100 RDI:
> ff3ea4f9460616b4
> [30880.306703] RBP: ff581bf4d4057e18 R08: ff3ea4ba8ba71000 R09:
> ff3ea4f9460616b4
> [30880.314879] R10: 0000000000000282 R11: ff3ea4f99b6b1000 R12:
> 00000000000000d9
> [30880.323054] R13: 00000000000000d9 R14: ff3ea4ba8ba713e0 R15:
> ff3ea4ba8bea4000
> [30880.331228] FS:  0000000000000000(0000) GS:ff3ea4d9a4771000(0000)
> knlGS:0000000000000000
> [30880.340491] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30880.347071] CR2: 00007f4320220b80 CR3: 00000002eb2b6004 CR4:
> 0000000000771ef0
> [30880.355246] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [30880.363411] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [30880.371585] PKRU: 55555554
> [30880.374677] Call Trace:
> [30880.377469]  <TASK>
> [30880.379865]  css_release_work_fn+0x6a/0x280
> [30880.384654]  process_one_work+0x19e/0x3e0
> [30880.389244]  worker_thread+0x2ad/0x3c0
> [30880.393533]  kthread+0x108/0x220
> [30880.397225]  ? __pfx_worker_thread+0x10/0x10
> [30880.402113]  ? __pfx_kthread+0x10/0x10
> [30880.406402]  ret_from_fork+0x3d/0x60
> [30880.410495]  ? __pfx_kthread+0x10/0x10
> [30880.414783]  ret_from_fork_asm+0x1a/0x30
> [30880.419274]  </TASK>
> [30880.421767] ---[ end trace 0000000000000000 ]---
> [30880.427054] BUG: kernel NULL pointer dereference, address: 00000000000003d8
> [30880.435030] #PF: supervisor read access in kernel mode
> [30880.440914] #PF: error_code(0x0000) - not-present page
> [30880.446797] PGD 10c466067 P4D 0
> [30880.450489] Oops: Oops: 0000 [#1] SMP NOPTI
> [30880.455275] CPU: 60 UID: 0 PID: 1740 Comm: kworker/60:1 Tainted: G      W           6.14.0-3e816361e94a-3e816361e9 #1 PREEMPT(voluntary)
> [30880.469630] Tainted: [W]=WARN
> [30880.473020] Hardware name: AMD Corporation VOLCANO/VOLCANO, BIOS RVOT1003E 12/11/2024 [30880.481993] Workqueue: cgroup_destroy css_release_work_fn [30880.496263] RIP: 0010:cgroup_rstat_flush+0x15f/0x7e0
> [30880.509926] Code: 00 00 0f 87 06 06 00 00 4e 8b 1c dd 80 0f d7 a9 4c 8b 45 d0 4d 01 dd 49 8b 95 a0 00 00 00 49 39 d0 0f 84 2b 05 00 00 4d 63 ec <48> 8b 82 d8 03 00 00 49 81 fd ff 1f 00 00 0f 87 68 05 00 00 4c 01 [30880.547238] RSP: 0018:ff581bf4d4057da0 EFLAGS: 00010086 [30880.560939] RAX: ff8a1bf4b547ed20 RBX: ff8a1bf4b3443fe0 RCX:
> ff3ea4ba8bea4000
> [30880.576665] RDX: 0000000000000000 RSI: 0000000000000100 RDI:
> ff3ea4f9460616b4
> [30880.592201] RBP: ff581bf4d4057e18 R08: ff3ea4ba8ba71000 R09:
> ff3ea4f9460616b4
> [30880.607522] R10: 0000000000000282 R11: ff3ea4f99b6b1000 R12:
> 00000000000000d9
> [30880.622685] R13: 00000000000000d9 R14: ff3ea4ba8ba713e0 R15:
> ff3ea4ba8bea4000
> [30880.637740] FS:  0000000000000000(0000) GS:ff3ea4d9a4771000(0000)
> knlGS:0000000000000000
> [30880.653936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30880.667349] CR2: 00000000000003d8 CR3: 00000002eb2b6004 CR4:
> 0000000000771ef0
> [30880.682262] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [30880.697046] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [30880.711708] PKRU: 55555554
> [30880.721341] Call Trace:
> [30880.730617]  <TASK>
> [30880.739413]  css_release_work_fn+0x6a/0x280 [30880.750598]  process_one_work+0x19e/0x3e0 [30880.761488]  worker_thread+0x2ad/0x3c0 [30880.771986]  kthread+0x108/0x220 [30880.781791]  ? __pfx_worker_thread+0x10/0x10 [30880.792795]  ? __pfx_kthread+0x10/0x10 [30880.803225]  ret_from_fork+0x3d/0x60 [30880.813347]  ? __pfx_kthread+0x10/0x10 [30880.823693]  ret_from_fork_asm+0x1a/0x30 [30880.834279]  </TASK> [30880.842867] Modules linked in: xt_CHECKSUM ipt_REJECT ipmi_ssif
> nls_iso8859_1 intel_rapl_msr wmi_bmof amd_atl intel_rapl_common amd64_edac edac_mce_amd kvm_amd rapl joydev input_leds ccp k10temp wmi acpi_ipmi ipmi_si ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr efi_pstore drm autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 hid_generic usbhid ahci ghash_clmulni_intel libahci hid tg3 i2c_piix4 i2c_smbus aesni_intel crypto_simd cryptd [last unloaded: ipmi_devintf] [30880.936605] CR2: 00000000000003d8 [30880.947860] ---[ end trace 0000000000000000 ]--- [30881.217743] RIP: 0010:cgroup_rstat_flush+0x15f/0x7e0
> [30881.230964] Code: 00 00 0f 87 06 06 00 00 4e 8b 1c dd 80 0f d7 a9 4c 8b 45 d0 4d 01 dd 49 8b 95 a0 00 00 00 49 39 d0 0f 84 2b 05 00 00 4d 63 ec <48> 8b 82 d8 03 00 00 49 81 fd ff 1f 00 00 0f 87 68 05 00 00 4c 01 [30881.268075] RSP: 0018:ff581bf4d4057da0 EFLAGS: 00010086 [30881.281948] RAX: ff8a1bf4b547ed20 RBX: ff8a1bf4b3443fe0 RCX:
> ff3ea4ba8bea4000
> [30881.298015] RDX: 0000000000000000 RSI: 0000000000000100 RDI:
> ff3ea4f9460616b4
> [30881.314116] RBP: ff581bf4d4057e18 R08: ff3ea4ba8ba71000 R09:
> ff3ea4f9460616b4
> [30881.330209] R10: 0000000000000282 R11: ff3ea4f99b6b1000 R12:
> 00000000000000d9
> [30881.346333] R13: 00000000000000d9 R14: ff3ea4ba8ba713e0 R15:
> ff3ea4ba8bea4000
> [30881.362294] FS:  0000000000000000(0000) GS:ff3ea4d9a4771000(0000)
> knlGS:0000000000000000
> [30881.379255] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30881.393613] CR2: 00000000000003d8 CR3: 00000002eb2b6004 CR4:
> 0000000000771ef0
> [30881.409535] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [30881.425335] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [30881.441045] PKRU: 55555554
> [30881.451577] note: kworker/60:1[1740] exited with irqs disabled [30881.465615] note: kworker/60:1[1740] exited with preempt_count 1 [30893.634139] watchdog: CPU156: Watchdog detected hard LOCKUP on cpu 156 [30893.634143] Modules linked in: xt_CHECKSUM ipt_REJECT ipmi_ssif
> nls_iso8859_1 intel_rapl_msr wmi_bmof amd_atl intel_rapl_common amd64_edac edac_mce_amd kvm_amd rapl joydev input_leds ccp k10temp wmi acpi_ipmi ipmi_si ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr efi_pstore drm autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 hid_generic usbhid ahci ghash_clmulni_intel libahci hid tg3 i2c_piix4 i2c_smbus aesni_intel crypto_simd cryptd [last unloaded: ipmi_devintf] [30893.634169] CPU: 156 UID: 0 PID: 91861 Comm: kworker/156:0 Tainted: G
>        D W           6.14.0-3e816361e94a-3e816361e9 #1 PREEMPT(voluntary)
> [30893.634173] Tainted: [D]=DIE, [W]=WARN [30893.634174] Hardware name: AMD Corporation VOLCANO/VOLCANO, BIOS RVOT1003E 12/11/2024 [30893.634175] Workqueue: cgroup_destroy css_free_rwork_fn [30893.634183] RIP: 0010:native_queued_spin_lock_slowpath+0x80/0x300
> [30893.634191] Code: 2c 24 08 0f 92 c2 41 8b 04 24 0f b6 d2 c1 e2 08 30
> e4 09 d0 a9 00 01 ff ff 75 69 85 c0 74 14 41 0f b6 04 24 84 c0 74 0b f3
> 90 <41> 0f b6 04 24 84 c0 75 f5 b8 01 00 00 00 66 41 89 04 24 5b 41 5c [30893.634192] RSP: 0018:ff581bf510c0fd08 EFLAGS: 00000002 [30893.634194] RAX: 0000000000000001 RBX: ff8a1bf4b747b8a0 RCX:
> 0000000000000000
> [30893.634195] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ff3ea4f9460616b4
> [30893.634196] RBP: ff581bf510c0fd30 R08: 00000000000000d9 R09:
> ff3ea4f9460616b4
> [30893.634197] R10: 0000000000000286 R11: 00000000000000d9 R12:
> ff3ea4f9460616b4
> [30893.634198] R13: 0000000000000286 R14: ff3ea4bad765f000 R15:
> 0000000000000000
> [30893.634198] FS:  0000000000000000(0000) GS:ff3ea4d9a4f71000(0000)
> knlGS:0000000000000000
> [30893.634199] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30893.634200] CR2: 00007f4320220b80 CR3: 000000404a660001 CR4:
> 0000000000771ef0
> [30893.634201] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [30893.634202] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [30893.634202] PKRU: 55555554
> [30893.634203] Call Trace:
> [30893.634205]  <TASK>
> [30893.634208]  _raw_spin_lock_irqsave+0x46/0x60
> [30893.634211]  cgroup_rstat_flush+0xf4/0x7e0
> [30893.634214]  cgroup_rstat_exit+0x20/0xf0
> [30893.634215]  css_free_rwork_fn+0x12e/0x400
> [30893.634216]  process_one_work+0x19e/0x3e0
> [30893.634221]  worker_thread+0x2ad/0x3c0
> [30893.634223]  kthread+0x108/0x220
> [30893.634225]  ? __pfx_worker_thread+0x10/0x10
> [30893.634227]  ? __pfx_kthread+0x10/0x10
> [30893.634228]  ret_from_fork+0x3d/0x60
> [30893.634233]  ? __pfx_kthread+0x10/0x10
> [30893.634234]  ret_from_fork_asm+0x1a/0x30
> [30893.634239]  </TASK>
> [30940.158820] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [30940.172867] rcu:     156-...0: (0 ticks this GP) idle=e8f4/1/0x4000000000000000 softirq=427128/427128 fqs=4630
> [30940.191257] rcu:     (detected by 12, t=15010 jiffies, g=639685, q=26165 ncpus=256)
> [30940.207160] Sending NMI from CPU 12 to CPUs 156:
> [30940.207166] NMI backtrace for cpu 156 [30940.207169] CPU: 156 UID: 0 PID: 91861 Comm: kworker/156:0 Tainted: G
>        D W           6.14.0-3e816361e94a-3e816361e9 #1 PREEMPT(voluntary)
> [30940.207172] Tainted: [D]=DIE, [W]=WARN [30940.207173] Hardware name: AMD Corporation VOLCANO/VOLCANO, BIOS RVOT1003E 12/11/2024 [30940.207175] Workqueue: cgroup_destroy css_free_rwork_fn [30940.207177] RIP: 0010:native_queued_spin_lock_slowpath+0x80/0x300
> [30940.207180] Code: 2c 24 08 0f 92 c2 41 8b 04 24 0f b6 d2 c1 e2 08 30
> e4 09 d0 a9 00 01 ff ff 75 69 85 c0 74 14 41 0f b6 04 24 84 c0 74 0b f3
> 90 <41> 0f b6 04 24 84 c0 75 f5 b8 01 00 00 00 66 41 89 04 24 5b 41 5c [30940.207181] RSP: 0018:ff581bf510c0fd08 EFLAGS: 00000002 [30940.207182] RAX: 0000000000000001 RBX: ff8a1bf4b747b8a0 RCX:
> 0000000000000000
> [30940.207183] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ff3ea4f9460616b4
> [30940.207184] RBP: ff581bf510c0fd30 R08: 00000000000000d9 R09:
> ff3ea4f9460616b4
> [30940.207185] R10: 0000000000000286 R11: 00000000000000d9 R12:
> ff3ea4f9460616b4
> [30940.207186] R13: 0000000000000286 R14: ff3ea4bad765f000 R15:
> 0000000000000000
> [30940.207187] FS:  0000000000000000(0000) GS:ff3ea4d9a4f71000(0000)
> knlGS:0000000000000000
> [30940.207188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30940.207188] CR2: 00007f4320220b80 CR3: 000000404a660001 CR4:
> 0000000000771ef0
> [30940.207189] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [30940.207190] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [30940.207191] PKRU: 55555554
> [30940.207191] Call Trace:
> [30940.207192]  <TASK>
> [30940.207193]  _raw_spin_lock_irqsave+0x46/0x60 [30940.207195]  cgroup_rstat_flush+0xf4/0x7e0 [30940.207197]  cgroup_rstat_exit+0x20/0xf0 [30940.207198]  css_free_rwork_fn+0x12e/0x400 [30940.207199]  process_one_work+0x19e/0x3e0 [30940.207202]  worker_thread+0x2ad/0x3c0 [30940.207204]  kthread+0x108/0x220 [30940.207205]  ? __pfx_worker_thread+0x10/0x10 [30940.207207]  ? __pfx_kthread+0x10/0x10 [30940.207208]  ret_from_fork+0x3d/0x60 [30940.207209]  ? __pfx_kthread+0x10/0x10 [30940.207210]  ret_from_fork_asm+0x1a/0x30 [30940.207213]  </TASK> [30970.132112] shutdown[1]: Unmounting '/oldroot' timed out, issuing SIGKILL to PID 147876.
> [30970.147562] shutdown[1]: Not all file systems unmounted, 1 left.
> [30970.160653] shutdown[1]: Deactivating swaps.
> [30970.171832] shutdown[1]: All swaps deactivated.
> [30970.183280] shutdown[1]: Detaching loop devices.
> [30970.195215] shutdown[1]: All loop devices detached.
> [30970.207077] shutdown[1]: Stopping MD devices.
> [30970.218370] shutdown[1]: All MD devices stopped.
> [30970.229857] shutdown[1]: Detaching DM devices.
> [30970.240989] shutdown[1]: All DM devices detached.
> [30970.252245] shutdown[1]: Unmounting file systems.
> [30970.263405] shutdown[1]: All filesystems unmounted.
> [30970.274714] shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
> [30970.296039] shutdown[1]: Syncing filesystems and block devices.
> [31000.308656] shutdown[1]: Syncing filesystems and block devices - timed out, issuing SIGKILL to PID 147892.
> [31000.325333] shutdown[1]: Rebooting with kexec.
> [31086.440252] watchdog: CPU232: Watchdog detected hard LOCKUP on cpu 232
> 
> 
> No luck in recreating the issue yet. In addition to the test scenario mentioned above, I also tried running stress-ng with cgroup stress for some time and performing kexec in a loop, but I could not recreate the issue. I also see similar issues reported here: [2], [3].
> 
> If there are any pointers to help with recreation, I would be happy to try them out and report back here.
> 
> 
> [1] https://github.com/intel/lkp-tests.git
> [2]
> https://lore.kernel.org/all/6564c3d6-9372-4352-9847-1eb3aea07ca4@linux.ibm.com/
> [3]
> https://lore.kernel.org/all/tencent_084EDA1878C098FFB951DC70F6FFCC896408@qq.com/
> 
> Thank you,
> Srikanth Aithal
>
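
The oops above reports a fault address of 00000000000003d8 with RDX equal to zero, and the "Code:" line marks the trapping instruction byte with angle brackets. The Python sketch below, which is not part of the original report, extracts that marked position and decodes the one instruction form seen here, to show how the fault address relates to the registers. The helper names parse_code_line and decode_mov_load are hypothetical, and the decoder deliberately handles only the "48 8b /r" load with a 32-bit displacement and no SIB byte. For full disassembly, the kernel tree's scripts/decodecode is the usual tool. The input is assumed to be an unquoted, unwrapped "Code:" line.

```python
import re

# One opcode byte as printed in an oops "Code:" line; the trapping byte
# is wrapped in angle brackets, e.g. "<48>".
HEX_BYTE = re.compile(r"^<?([0-9a-f]{2})>?$")

# 64-bit register names indexed by the 3-bit ModRM field (no REX.R/REX.B).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def parse_code_line(line):
    """Return (code_bytes, trap_index) parsed from an oops 'Code:' line."""
    tokens = line.split("Code:", 1)[1].split()
    data = bytearray()
    trap = None
    for tok in tokens:
        m = HEX_BYTE.match(tok)
        if not m:
            break  # trailing text, e.g. a fused "[timestamp] RSP:" entry
        if tok.startswith("<"):
            trap = len(data)
        data.append(int(m.group(1), 16))
    if trap is None:
        raise ValueError("no <xx> trapping byte found")
    return bytes(data), trap


def decode_mov_load(insn):
    """Decode only 'REX.W 8b /r' with mod=10 and no SIB byte.

    Returns AT&T-style text such as 'mov 0x3d8(%rdx),%rax', or None for
    any other encoding (this is an assumption-limited sketch).
    """
    if len(insn) < 7 or insn[0] != 0x48 or insn[1] != 0x8B:
        return None
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        return None
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return f"mov {disp:#x}(%{REG64[rm]}),%{REG64[reg]}"


if __name__ == "__main__":
    line = ("Code: 4d 63 ec <48> 8b 82 d8 03 00 00 49 81 fd ff 1f 00 00 "
            "[30880.547238] RSP: 0018:ff581bf4d4057da0")
    code, trap = parse_code_line(line)
    print(decode_mov_load(code[trap:]))
```

With the bytes from the report, the trapping instruction is a load from offset 0x3d8 off RDX. Since RDX is 0 in the register dump, this matches the reported fault address and suggests a NULL structure pointer being dereferenced inside cgroup_rstat_flush; which structure that is cannot be determined from the log alone.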