Message-ID: <e9666935-f7af-4419-bb85-e1f041c64ea9@amd.com>
Date: Wed, 2 Jul 2025 10:02:14 +0530
From: "Aithal, Srikanth" <sraithal@....com>
To: "Chen, Yu C" <yu.c.chen@...el.com>, Jirka Hladky <jhladky@...hat.com>,
Abhigyan ghosh <zscript.team.zs@...il.com>
Cc: linux-kernel@...r.kernel.org, Suneeth D <Suneeth.D@....com>
Subject: Re: [BUG] Kernel panic in __migrate_swap_task() on 6.16-rc2 (NULL pointer dereference)

On 6/27/2025 1:03 PM, Chen, Yu C wrote:
> On 6/27/2025 3:16 PM, Chen, Yu C wrote:
>> Hi Jirka,
>>
>> On 6/27/2025 5:46 AM, Jirka Hladky wrote:
>>> Hi Chen and all,
>>>
>>> we have now verified that the following commit causes a kernel panic
>>> discussed in this thread:
>>>
>>> ad6b26b6a0a79 sched/numa: add statistics of numa balance task
>>>
>>> Reverting this commit fixes the issue.
>>>
>>> I'm happy to help debug this further or test a proposed fix.
>>>
>>
>> Thanks very much for your report. It seems there is a race
>> condition: the swap task candidate is chosen, but its mm_struct
>> is then released because the task exits. Later, when the task
>> swap is performed, p->mm is NULL, which causes the problem:
>>
>> CPU0                                    CPU1
>> :
>> ...
>> task_numa_migrate
>>   task_numa_find_cpu
>>     task_numa_compare
>>       # a normal task p is chosen
>>       env->best_task = p
>>
>>                                         # p exit:
>>                                         exit_signals(p);
>>                                           p->flags |= PF_EXITING
>>                                         exit_mm
>>                                           p->mm = NULL;
>>
>>   migrate_swap_stop
>>     __migrate_swap_task((arg->src_task, arg->dst_cpu)
>>       count_memcg_event_mm(p->mm, NUMA_TASK_SWAP) # p->mm is NULL
>>
>> Could you please help check if the following debug patch works,
>
> Attached the patch:
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8988d38d46a3..82fc966b390c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3364,7 +3364,12 @@ static void __migrate_swap_task(struct task_struct *p, int cpu)
>  {
>  	__schedstat_inc(p->stats.numa_task_swapped);
>  	count_vm_numa_event(NUMA_TASK_SWAP);
> -	count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> +	if (unlikely(!p->mm)) {
> +		trace_printk("!! (%d %s) flags=%lx\n", p->pid, p->comm,
> +			     p->flags);
> +	} else {
> +		count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> +	}
>
>  	if (task_on_rq_queued(p)) {
>  		struct rq *src_rq, *dst_rq;

I have been hitting the same issue reported earlier in this thread. It
recurs in our daily linux-next CI builds within our virtualization CI
stream, where the BUG appears randomly during the runs.

We were also able to reproduce the issue while running the autonuma
benchmark. There, too, the BUG occurs randomly across iterations,
typically between the 5th and 10th iterations.

We consistently encountered this issue up to the
6.16.0-rc4-next-20250630 build
[https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git].

After applying the patch quoted above on top of the next-20250630 build,
I tested it in our virtualization CI and with the autonuma benchmark
reproducer described below, and the issue no longer occurred. The patch
appears to have resolved the reported problem.

Reproducer:

    git clone https://github.com/pholasek/autonuma-benchmark.git
    cd autonuma-benchmark
    for i in $(seq 1 80); do bash ./start_bench.sh -s -t; done

Note: The server running autonuma-benchmark must have at least two NUMA
nodes.
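
Because the reproducer is only meaningful on a multi-node machine, a
small pre-check can avoid a wasted run. The following is a minimal
sketch and not part of autonuma-benchmark: the function names
count_numa_nodes and has_multiple_numa_nodes are made up for
illustration, and the sketch assumes the usual sysfs layout in which
each NUMA node appears as /sys/devices/system/node/nodeN.

```shell
#!/bin/sh
# Hypothetical helper (not part of autonuma-benchmark): count NUMA nodes
# by counting nodeN directories. The optional argument overrides the
# sysfs root, which makes the function testable on a fake directory tree.
count_numa_nodes() {
    root="${1:-/sys/devices/system/node}"
    n=0
    for d in "$root"/node[0-9]*; do
        # An unmatched glob stays literal, so confirm it is a directory.
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: succeed only if at least two nodes are present.
has_multiple_numa_nodes() {
    [ "$(count_numa_nodes "${1:-}")" -ge 2 ]
}

# Possible use before starting the benchmark loop:
#   has_multiple_numa_nodes || { echo "need >= 2 NUMA nodes" >&2; exit 1; }
```

Counting directories rather than parsing tool output keeps the check
free of extra dependencies, at the cost of relying on the sysfs layout
assumed above.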

If the provided fix is final, please feel free to include the following
Tested-by tags:

Tested-by: Srikanth Aithal <Srikanth.Aithal@....com>
Tested-by: Suneeth D <Suneeth.D@....com>