linux-kernel - Re: [BUG] Kernel panic in __migrate_swap_task() on 6.16-rc2 (NULL pointer dereference)


Message-ID: <e9666935-f7af-4419-bb85-e1f041c64ea9@amd.com>
Date: Wed, 2 Jul 2025 10:02:14 +0530
From: "Aithal, Srikanth" <sraithal@....com>
To: "Chen, Yu C" <yu.c.chen@...el.com>, Jirka Hladky <jhladky@...hat.com>,
 Abhigyan ghosh <zscript.team.zs@...il.com>
Cc: linux-kernel@...r.kernel.org, Suneeth D <Suneeth.D@....com>
Subject: Re: [BUG] Kernel panic in __migrate_swap_task() on 6.16-rc2 (NULL
 pointer dereference)

On 6/27/2025 1:03 PM, Chen, Yu C wrote:
> On 6/27/2025 3:16 PM, Chen, Yu C wrote:
>> Hi Jirka,
>>
>> On 6/27/2025 5:46 AM, Jirka Hladky wrote:
>>> Hi Chen and all,
>>>
>>> we have now verified that the following commit causes a kernel panic
>>> discussed in this thread:
>>>
>>> ad6b26b6a0a79 sched/numa: add statistics of numa balance task
>>>
>>> Reverting this commit fixes the issue.
>>>
>>> I'm happy to help debug this further or test a proposed fix.
>>>
>>
>> Thanks very much for your report. It seems there is a race
>> condition: after the swap task candidate is chosen, its
>> mm_struct is released because the task exits, so when the
>> task swap is performed later, p->mm is NULL, which causes
>> the problem:
>>
>> CPU0                                   CPU1
>> :
>> ...
>> task_numa_migrate
>>    task_numa_find_cpu
>>     task_numa_compare
>>       # a normal task p is chosen
>>       env->best_task = p
>>
>>                                         # p exit:
>>                                         exit_signals(p);
>>                                            p->flags |= PF_EXITING
>>                                         exit_mm
>>                                            p->mm = NULL;
>>
>>     migrate_swap_stop
>>       __migrate_swap_task(arg->src_task, arg->dst_cpu)
>>        count_memcg_event_mm(p->mm, NUMA_TASK_SWAP) # p->mm is NULL
>>
>> Could you please help check if the following debug patch works,
> 
> Attached the patch:
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8988d38d46a3..82fc966b390c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3364,7 +3364,12 @@ static void __migrate_swap_task(struct task_struct *p, int cpu)
>   {
>       __schedstat_inc(p->stats.numa_task_swapped);
>       count_vm_numa_event(NUMA_TASK_SWAP);
> -    count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> +    if (unlikely(!p->mm)) {
> +        trace_printk("!! (%d %s) flags=%lx\n", p->pid, p->comm,
> +                p->flags);
> +    } else {
> +        count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> +    }
> 
>       if (task_on_rq_queued(p)) {
>           struct rq *src_rq, *dst_rq;

I have been hitting the same issue reported earlier in this thread. It 
has been recurring in the daily linux-next builds in our virtualization 
CI stream, where the BUG appears randomly during the runs.

We were also able to reproduce the issue with the autonuma benchmark. 
There the BUG occurs at random, typically between the 5th and 10th 
iterations.

We consistently encountered this issue up to the 
6.16.0-rc4-next-20250630 build 
[https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git].

After applying the patch above on top of the next-20250630 build, I 
tested it in our virtualization CI and with the autonuma benchmark 
reproducer described below, and the issue no longer occurred. The patch 
appears to resolve the reported problem.
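
For illustration only, the guard the debug patch adds can be modelled in 
plain userspace C. This is a minimal sketch, not kernel code: struct 
mm_model, struct task_model and account_task_swap() are hypothetical 
stand-ins, and the counters only imitate p->stats.numa_task_swapped and 
the NUMA_TASK_SWAP memcg event.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's definitions. */
struct mm_model {
    unsigned long numa_task_swap_events; /* imitates the memcg NUMA_TASK_SWAP count */
};

struct task_model {
    int pid;
    struct mm_model *mm;                 /* NULL once the task has passed exit_mm() */
    unsigned long numa_task_swapped;     /* imitates p->stats.numa_task_swapped */
};

/*
 * Models only the accounting part of the patched __migrate_swap_task().
 * Returns 1 if the mm-based counter was updated, 0 if it was skipped
 * because the task no longer has an mm.
 */
static int account_task_swap(struct task_model *p)
{
    assert(p != NULL);

    p->numa_task_swapped++;              /* per-task stat does not touch p->mm */

    if (p->mm == NULL)                   /* exiting task: skip, do not dereference */
        return 0;

    p->mm->numa_task_swap_events++;
    return 1;
}
```

Calling account_task_swap() on a task whose mm is NULL returns 0 and 
leaves the mm-based counter untouched, which mirrors the branch where the 
debug patch emits trace_printk() instead of calling 
count_memcg_event_mm().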


git clone https://github.com/pholasek/autonuma-benchmark.git
cd autonuma-benchmark
for i in $(seq 1 80); do bash ./start_bench.sh -s -t; done

Note: The server running the autonuma-benchmark must have at least two 
NUMA nodes.
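
One possible way to check that precondition before starting the loop is 
sketched below. It assumes the usual sysfs file 
/sys/devices/system/node/online, which holds a list such as "0-1"; the 
helper names count_ids_in_list() and online_numa_nodes() are made up for 
this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the IDs in a kernel-style list string such as "0-1" or "0,2-3".
 * A trailing newline is accepted. Returns -1 on malformed input.
 */
static int count_ids_in_list(const char *s)
{
    int count = 0;

    while (*s != '\0' && *s != '\n') {
        char *end;
        long lo, hi;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = strtol(s, &end, 10);
        hi = lo;
        s = end;
        if (*s == '-') {                 /* range "lo-hi" */
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtol(s, &end, 10);
            s = end;
        }
        if (hi < lo)
            return -1;
        count += (int)(hi - lo + 1);
        if (*s == ',')
            s++;                         /* move on to the next entry */
        else if (*s != '\0' && *s != '\n')
            return -1;
    }
    return count;
}

/* Returns the number of online NUMA nodes, or -1 if sysfs cannot be read. */
static int online_numa_nodes(void)
{
    char buf[256];
    FILE *f = fopen("/sys/devices/system/node/online", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return count_ids_in_list(buf);
}
```

If online_numa_nodes() returns a value below 2, the machine does not meet 
the requirement above and is not a suitable host for the reproducer.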

If the provided fix is final, please feel free to include the following 
Tested-by tag:

Tested-by: Srikanth Aithal <Srikanth.Aithal@....com>
Tested-by: Suneeth D <Suneeth.D@....com>