linux-kernel - Re: [BUG][6.15][perf] Kernel panic not syncing: Fatal exception in interrupt

Message-ID: <83b4d26.3362.19730a21115.Coremail.00107082@163.com>
Date: Mon, 2 Jun 2025 20:33:37 +0800 (CST)
From: "David Wang" <00107082@....com>
To: yeoreum.yun@....com
Cc: peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
	namhyung@...nel.org, mingo@...nel.org, yeoreum.yun@....com,
	leo.yan@....com, linux-perf-users@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [BUG][6.15][perf] Kernel panic not syncing: Fatal exception in
 interrupt



At 2025-06-02 20:06:10, "Yeoreum Yun" <> wrote:
>Sorry for the noise, all.
>I forgot to Cc the mailing list.
>If you receive duplicate mail, sorry again...
>
>====================================================
>Hi David,
>
>> Hi,
>>
>> I caught a kernel panic when rebooting; the system was stuck until I pressed the
>> power button. I only have a screenshot of it, so the following logs were extracted
>> from the captured picture.
>>
>> 863.881960] sysvec_call_function_single+0x4c/0xc0
>> 863.881301] asm_sysvec_call_function_single+0x16/0x20
>> 869.881344] RIP: 0033:0x7f9a1cea3367
>> 663.681373] Code: 00 66 99 b8 ff ff ff ff c3 66 ....
>> 863.881524] RSP: 002b:00007fffa526fcf8 EFLAGS: 00000246
>> 869.881567] RAX: 0000562060c962d0 RBX: 0000000000000002 RCX: 00007f9a1cff1c60
>> 863.881625] RDX: 00007f9a0c000030 RSI: 00007f9a1cff1c60 RDI: 00007f9a1ca91c20
>> 863.081682] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f9a1d6217a0
>> 869.881740] R10: 00007f9a1ca91c10 R11: 0000000000000246 R12: 00007f9a1d70c020
>> 869.881798] R13: 00007fffa5270030 R14: 00007fffa526fd00 R15: 0000000000000000
>> 863.881860] </TASK>
>> 863.881876] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E)...
>> ...
>> 863.887142] button(E)
>> 863.912127] CR2: ffffe4afcc079650
>> 863.914593] ---[ end trace 0000000000000000 ]---
>> 864.042750] RIP: 0010:ctx_sched_out+0x1ce/0x210
>> 864.045214] Code: 89 c6 4c 8b b9 de 00 00 00 48 ...
>> 864.050343] RSP: 0000:ffffaa4ec0f3fe60 EFLAGS: 00010086
>> 864.052929] RAX: 0000000000000002 RBX: ffff8e8eeed2a580 RCX: ffff8e8bded9bf00
>> 864.055518] RDX: 000000c92340b051 RSI: 000000c92340b051 RDI: ffff
>> 864.058093] RBP: 0000000000000000 R08: 0000000000000002 R09: 00
>> 864.060654] R10: 0000000000000000 R11: 0000000000000000 R12: 000
>> 864.063183] R13: ffff8e8eeed2a580 R14: 0000000000000007 R15: ffffe4afcc079650
>> 864.065729] FS: 00007f9a1ca91940(0000) GS:ffff8e8f6b1c3000(0000) knlGS:0000000000000000
>> 864.068312] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> 864.070898] CR2: ffffe4afcc079650 CR3: 00000001136d8000 CR4: 0000000000350ef0
>> 864.673523] Kernel panic - not syncing: Fatal exception in interrupt
>> 864.076410] Kernel Offset: 0xc00000 from 0xffffffff81000000 (relocation range: 0xff
>> 864.205401] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>>
>> This has been happening since 6.15-rc1: from time to time I get a kernel panic
>> on reboot. Only recently did I figure out a procedure that reproduces it
>> with *high* probability:
>>
>> 1. create a cgroup.
>> 2. perf_event_open(PERF_FLAG_FD_CLOEXEC|PERF_FLAG_PID_CGROUP) for each cpu with following attrs:
>> 	attr.type = PERF_TYPE_SOFTWARE;
>> 	attr.size = sizeof(attr);
>> 	attr.config = PERF_COUNT_SW_CPU_CLOCK;
>> 	attr.sample_freq = 9999;
>> 	attr.freq = 1;
>> 	attr.wakeup_events = 16;
>> 	attr.sample_type = PERF_SAMPLE_CALLCHAIN;
>> 	attr.sample_max_stack = 32;
>> 	attr.exclude_callchain_user = 1;
>> 3. close all the perf_event_open fds after several minutes
>> 4. reboot
>>
>> After an exhausting bisect on events/core.c (I needed 5 rounds to call a commit good),
>> I have concluded, with very high probability, that this is caused by
>>
>> commit a3c3c66670cee11eb13aa43905904bf29cb92d32
>> Author: Yeoreum Yun <yeoreum.yun@....com>
>> Date:   Wed Mar 26 08:20:03 2025 +0000
>>
>>    perf/core: Fix child_total_time_enabled accounting bug at task exit
>>
>> Reverting it fixes the problem: I ran the test for 10 rounds and observed no kernel panic.
>>
>> The changes made to __perf_remove_from_context by commit a3c3c6667 ("perf/core:
>> Fix child_total_time_enabled accounting bug at task exit") have a wider effect
>> than the call chain mentioned in the commit message. I think an easy fix would
>> be to restrict the effect to that call chain only and restore the rest to its
>> previous behavior.
>>
>> I have tested the patch below for several rounds, and so far so good; I will run
>> more tests on it.
>>
>> Signed-off-by: David Wang <00107082@....com>
>
>Thanks for the report, and sorry for my mistake.
>With my change, the nr_cgroups tracking is broken, which could leave a dangling
>pointer in cpuctx->cgrp.
>
>Could you please test with the change below?
>
>diff --git a/kernel/events/core.c b/kernel/events/core.c
>index 95e703891b24..d0a9096735b9 100644
>--- a/kernel/events/core.c
>+++ b/kernel/events/core.c
>@@ -2116,18 +2116,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
>        if (event->group_leader == event)
>                del_event_from_groups(event, ctx);
>
>-       /*
>-        * If event was in error state, then keep it
>-        * that way, otherwise bogus counts will be
>-        * returned on read(). The only way to get out
>-        * of error state is by explicit re-enabling
>-        * of the event
>-        */
>-       if (event->state > PERF_EVENT_STATE_OFF) {
>-               perf_cgroup_event_disable(event, ctx);
>-               perf_event_set_state(event, PERF_EVENT_STATE_OFF);
>-       }
>-
>        ctx->generation++;
>        event->pmu_ctx->nr_events--;
> }
>@@ -2471,6 +2459,16 @@ __perf_remove_from_context(struct perf_event *event,
>
>        ctx_time_update(cpuctx, ctx);
>
>+       /*
>+        * If event was in error state, then keep it
>+        * that way, otherwise bogus counts will be
>+        * returned on read(). The only way to get out
>+        * of error state is by explicit re-enabling
>+        * of the event
>+        */
>+       if (event->state > PERF_EVENT_STATE_OFF)
>+               perf_cgroup_event_disable(event, ctx);
>+
>        /*
>         * Ensure event_sched_out() switches to OFF, at the very least
>         * this avoids raising perf_pending_task() at this time.
>
>Thanks
>
>
>--
>Sincerely,
>Yeoreum Yun

Before I start testing, I am concerned about the following call chain:

./kernel/fork.c:
bad_fork_cleanup_perf:
    perf_event_free_task() 
        perf_free_event()
            list_del_event()

This patch seems to change the behavior in this call chain.
Could this have other side effects?
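
For anyone trying to reproduce this, step 2 of the procedure quoted above could
look roughly like the sketch below. It only copies the attr values from the
report and passes them to perf_event_open(2) through syscall(2). The helper
names fill_repro_attr() and open_cgroup_event() are made up for illustration,
and cgroup_fd is assumed to be an open fd on the cgroup directory created in
step 1. This is not the actual test program.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill attr with the values listed in step 2
 * of the report. Everything not listed there is left zeroed.
 */
static void fill_repro_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_SOFTWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_SW_CPU_CLOCK;
	attr->sample_freq = 9999;
	attr->freq = 1;			/* sample_freq is a frequency, not a period */
	attr->wakeup_events = 16;
	attr->sample_type = PERF_SAMPLE_CALLCHAIN;
	attr->sample_max_stack = 32;
	attr->exclude_callchain_user = 1;
}

/*
 * Hypothetical helper: open one cgroup-scoped event on one CPU.
 * With PERF_FLAG_PID_CGROUP the pid argument is an fd on the cgroup
 * directory. glibc has no perf_event_open() wrapper, hence syscall(2).
 * Returns the event fd, or -1 with errno set.
 */
static int open_cgroup_event(int cgroup_fd, int cpu)
{
	struct perf_event_attr attr;

	fill_repro_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, cgroup_fd, cpu,
			    -1 /* group_fd */,
			    PERF_FLAG_FD_CLOEXEC | PERF_FLAG_PID_CGROUP);
}
```

A caller would loop over the online CPUs, keep the returned fds open for
several minutes, then close() them all (step 3) before rebooting (step 4).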


David