Message-ID: <bd936eba-e536-4825-ae64-d1bd23c6eb4c@intel.com>
Date: Mon, 5 May 2025 23:03:10 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: "Jain, Ayush" <ayushjai@....com>, Andrew Morton
<akpm@...ux-foundation.org>
CC: Ingo Molnar <mingo@...hat.com>, Tejun Heo <tj@...nel.org>, Johannes Weiner
<hannes@...xchg.org>, Jonathan Corbet <corbet@....net>, Mel Gorman
<mgormanmgorman@...e.de>, Michal Hocko <mhocko@...nel.org>, Michal Koutny
<mkoutny@...e.com>, Muchun Song <muchun.song@...ux.dev>, Roman Gushchin
<roman.gushchin@...ux.dev>, Shakeel Butt <shakeel.butt@...ux.dev>, "Chen, Tim
C" <tim.c.chen@...el.com>, Aubrey Li <aubrey.li@...el.com>, Libo Chen
<libo.chen@...cle.com>, <cgroups@...r.kernel.org>,
<linux-doc@...r.kernel.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, K Prateek Nayak <kprateek.nayak@....com>,
Madadi Vineeth Reddy <vineethr@...ux.ibm.com>, <Neeraj.Upadhyay@....com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v3] sched/numa: add statistics of numa balance task
migration
On 5/5/2025 2:43 PM, Jain, Ayush wrote:
>
> Hello,
>
> Hitting Kernel Panic on latest-next while running rcutorture tests
>
> 37ff6e9a2ce3 ("Add linux-next specific files for 20250502")
>
> reverting this patch fixes it
> 3b2339eeb032 ("sched-numa-add-statistics-of-numa-balance-task-migration-v3")
> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/kernel/sched/core.c?id=3b2339eeb032627e9329daf70a4ba8cd62c9cc8d
>
> by looking at RIP pointer
>
> $ ./scripts/faddr2line vmlinux __migrate_swap_task+0x2e/0x180
> __migrate_swap_task+0x2e/0x180:
> count_memcg_events_mm at include/linux/memcontrol.h:987
> (inlined by) count_memcg_events_mm at include/linux/memcontrol.h:978
> (inlined by) __migrate_swap_task at kernel/sched/core.c:3356
>
> memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
> mm->owner -> NULL
>
> Attaching kernel logs below:
>
> [ 1070.635450] rcu-torture: rcu_torture_read_exit: End of episode
> [ 1074.047617] BUG: kernel NULL pointer dereference, address: 0000000000000498
Thanks Ayush,
The faulting address (0x498) matches the displacement of the second load below:
4c 8b af 50 09 00 00 mov 0x950(%rdi),%r13 <--- r13 = p->mm;
49 8b bd 98 04 00 00 mov 0x498(%r13),%rdi <--- p->mm->owner
So it seems that the task being swapped has a NULL mm_struct (p->mm == NULL).
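To make the address arithmetic concrete, here is a minimal user-space sketch. It assumes a hypothetical struct fake_mm whose only purpose is to place an owner field at offset 0x498, as in the disassembly above; the real struct mm_struct layout depends on the kernel configuration. Loading owner through a NULL base then touches address 0 + 0x498, which is the address reported in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mm_struct. Only the offset of
 * 'owner' matters; 0x498 mirrors the displacement in the disassembly
 * and is NOT taken from any real kernel header.
 */
struct fake_mm {
	char pad[0x498];
	void *owner;
};

/* Compile-time check that the mock layout matches the assumed offset. */
static_assert(offsetof(struct fake_mm, owner) == 0x498,
	      "owner must sit at the displacement seen in the oops");

/* Address the CPU would access when loading mm->owner from mm_base. */
static uintptr_t owner_load_address(uintptr_t mm_base)
{
	return mm_base + offsetof(struct fake_mm, owner);
}

/* Nonzero if a faulting address is consistent with a NULL mm pointer. */
static int fault_matches_null_mm(uintptr_t fault_addr)
{
	return fault_addr == owner_load_address(0);
}
```

With these helpers, fault_matches_null_mm(0x498) is true, which is consistent with p->mm being NULL at the time of the second load.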
Does the following help?
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 96db6947bc92..0cb8cc4d551d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3353,7 +3353,8 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 static void __migrate_swap_task(struct task_struct *p, int cpu)
 {
 	__schedstat_inc(p->stats.numa_task_swapped);
-	count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
+	if (p->mm)
+		count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
 
 	if (task_on_rq_queued(p)) {
 		struct rq *src_rq, *dst_rq;
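
The effect of the guard can be sketched in user space as follows. All types and function names here (mock_mm, mock_task, mock_count_event_mm, mock_account_swap) are hypothetical stand-ins, not kernel definitions; the sketch only assumes that the per-mm counter dereferences mm unconditionally, as the faddr2line output suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel structures. */
struct mock_mm {
	unsigned long numa_task_swap_events;
};

struct mock_task {
	struct mock_mm *mm;		/* NULL for a task without an mm */
	unsigned long numa_task_swapped;
};

/* Mimics a per-mm counter that dereferences mm unconditionally. */
static void mock_count_event_mm(struct mock_mm *mm)
{
	assert(mm != NULL);		/* the kernel would oops here instead */
	mm->numa_task_swap_events++;
}

/* Mirrors the guarded accounting in the hunk above. */
static void mock_account_swap(struct mock_task *p)
{
	p->numa_task_swapped++;		/* per-task stat: always safe */
	if (p->mm)			/* skip tasks whose mm is NULL */
		mock_count_event_mm(p->mm);
}
```

In this sketch a task with mm == NULL still has its per-task counter incremented, but the per-mm event is skipped rather than dereferencing a NULL pointer.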
Hi Andrew,
Could we hold this patch and not merge it for now? Besides this
regression, Libo has another comment on this patch, which I'll
address in the next version. Sorry for the inconvenience.
thanks,
Chenyu