Message-ID: <DB8FFA27-CA9D-4693-9917-5478A0940C79@gmail.com>
Date: Thu, 19 Jun 2025 19:53:34 +0530
From: Abhigyan ghosh <zscript.team.zs@...il.com>
To: Jirka Hladky <jhladky@...hat.com>
CC: linux-kernel@...r.kernel.org
Subject: Re: Kernel panic in __migrate_swap_task() under stress-ng
Thanks, Jirka, for the clarification!
That makes sense. I initially thought the crash location might correlate with test order, but now I understand the ordering is simply alphabetical. I’ll focus on which tests were active when the crash occurred (like ‘sem’ and ‘fork’) rather than on their position in the run.
Appreciate your help again!
— Abhigyan
On 19 June 2025 5:31:28 pm IST, Jirka Hladky <jhladky@...hat.com> wrote:
>Thank you, Abhigyan!
>
> "often crashing around test 30+ out of 41"
>
>This is not relevant. We run 41 different benchmarks from Libmicro and
>order them alphabetically, so test #30 has no special meaning.
>
>> Let’s see if I can narrow it down further. If I get a hit, I’ll share the
>> trace.
>
>Keeping my fingers crossed!
>
>Jirka
>
>
>On Thu, Jun 19, 2025 at 7:14 AM Abhigyan ghosh <zscript.team.zs@...il.com>
>wrote:
>
>>
>>
>> Hi Jirka,
>>
>> Thanks again for the detailed logs and clarification.
>>
>> Based on your trace and timing (often crashing around test 30+ out of 41,
>> after a long runtime), I suspect a use-after-free or a delayed wake-up
>> race in the CPU stopper thread handling.
>>
>> In particular, I noticed:
>>
>> The faulting RIP, __migrate_swap_task+0x2f, dereferences offset 0x4c8
>> from a NULL task_struct pointer.
>>
>> That offset lands near task->se.cfs_rq or task->sched_info on some
>> architectures, which makes me wonder whether the task had already been
>> dequeued from its CPU’s rq during the swap or during sem cleanup.
>>
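>> If I can get hold of a vmlinux with debug info for the exact crashing
>> build, the offset can be pinned down instead of guessed. Roughly, and
>> untested (0x4c8 is 1224 in the decimal offsets pahole prints):
>>
>>     # resolve the faulting RIP back to a source line
>>     ./scripts/faddr2line vmlinux __migrate_swap_task+0x2f
>>
>>     # dump task_struct for this config and find the member whose
>>     # offset comment is at or just below 1224
>>     pahole -C task_struct vmlinux | less
>>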
>> Since stress-ng runs short, timed sem/fork loops with a varying number
>> of workers, maybe the task was being torn down while the swap was still
>> in flight?
>>
>>
>> As an experiment, I’ll try:
>>
>>  - looping stress-ng --sem --taskset 0-15
>>
>>  - watching perf top and tracing with ftrace on migrate_swap_stop and
>>    task_rq_lock (rough commands below)
>>
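>> Something like this, untested (the worker count and timeout are just my
>> first guesses; --sem needs an explicit worker count):
>>
>>     # loop the sem stressor pinned to CPUs 0-15 until something breaks
>>     while :; do
>>         stress-ng --sem 16 --taskset 0-15 --timeout 60s || break
>>     done
>>
>>     # in parallel, follow the suspect path with the function_graph tracer
>>     cd /sys/kernel/tracing
>>     echo migrate_swap_stop > set_graph_function
>>     echo function_graph > current_tracer
>>     echo 1 > tracing_on
>>     cat trace_pipe > /tmp/migrate_swap.trace
>>
>> (Either function may be inlined or marked notrace on a given build, so
>> I’ll check available_filter_functions first; task_rq_lock in particular
>> may not show up.)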
>>
>> Let’s see if I can narrow it down further. If I get a hit, I’ll share the
>> trace.
>>
>> Thanks again —
>> Best regards,
>> Abhigyan Ghosh
>> zsml.zscript.org
>>
>> aghosh
>>
>>
>