Message-ID: <xhsmh7bymlg2f.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Fri, 29 Aug 2025 09:53:12 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>, Shrikanth Hegde
<sshegde@...ux.ibm.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>, Dietmar Eggemann
<dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, Ben
Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
thomas.weissschuh@...utronix.de, Li Chen <chenl311@...natelecom.cn>, Bibo
Mao <maobibo@...ngson.cn>, Mete Durlu <meted@...ux.ibm.com>, Tobias
Huschle <huschle@...ux.ibm.com>, Easwar Hariharan
<easwar.hariharan@...ux.microsoft.com>, Guo Weikang
<guoweikang.kernel@...il.com>, "Rafael J. Wysocki"
<rafael.j.wysocki@...el.com>, Brian Gerst <brgerst@...il.com>, Patryk
Wlazlyn <patryk.wlazlyn@...ux.intel.com>, Swapnil Sapkal
<swapnil.sapkal@....com>, "Yury Norov [NVIDIA]" <yury.norov@...il.com>,
Sudeep Holla <sudeep.holla@....com>, Jonathan Cameron
<Jonathan.Cameron@...wei.com>, Andrea Righi <arighi@...dia.com>, Yicong
Yang <yangyicong@...ilicon.com>, Ricardo Neri
<ricardo.neri-calderon@...ux.intel.com>, Tim Chen
<tim.c.chen@...ux.intel.com>, Vinicius Costa Gomes
<vinicius.gomes@...el.com>, Madhavan Srinivasan <maddy@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin
<npiggin@...il.com>, Christophe Leroy <christophe.leroy@...roup.eu>, Heiko
Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>, Alexander
Gordeev <agordeev@...ux.ibm.com>, Christian Borntraeger
<borntraeger@...ux.ibm.com>, Sven Schnelle <svens@...ux.ibm.com>, Thomas
Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav
Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>, Juri Lelli
<juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH v7 0/8] sched/fair: Get rid of sched_domains_curr_level
hack for tl->cpumask()
On 26/08/25 12:13, Peter Zijlstra wrote:
> Subject: sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask()
> From: Peter Zijlstra <peterz@...radead.org>
> Date: Mon, 25 Aug 2025 12:02:44 +0000
>
> Leon [1] and Vinicius [2] noted a topology_span_sane() warning during
> their testing starting from v6.16-rc1. Debug that followed pointed to
> the tl->mask() for the NODE domain being incorrectly resolved to that of
> the highest NUMA domain.
>
> tl->mask() for NODE is set to the sd_numa_mask() which depends on the
> global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
> set to the "tl->numa_level" during tl traversal in build_sched_domains()
> calling sd_init() but was not reset before topology_span_sane().
>
> Since "tl->numa_level" still reflected the old value from
> build_sched_domains(), topology_span_sane() for the NODE domain trips
> when the span of the last NUMA domain overlaps.
>
> Instead of replicating the "sched_domains_curr_level" hack, get rid of
> it entirely and instead, pass the entire "sched_domain_topology_level"
> object to tl->cpumask() function to prevent such mishap in the future.
>
> sd_numa_mask() now directly references "tl->numa_level" instead of
> relying on the global "sched_domains_curr_level" hack to index into
> sched_domains_numa_masks[].
>
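For readers following along, the idea in the last two paragraphs can be sketched in plain C. This is a minimal userspace model, not the kernel code: the model_* names, the unsigned long mask type, and the two-level, four-CPU table are all assumptions made for illustration. The point it shows is that the mask callback receives the topology level itself and indexes the per-level table via tl->numa_level, so the result cannot depend on which level was visited last.

```c
#include <assert.h>

/* Hypothetical model sizes; the kernel sizes these at runtime. */
#define MODEL_NR_CPUS   4
#define MODEL_NR_LEVELS 2

/* One bit per CPU; stands in for struct cpumask (assumption). */
typedef unsigned long model_mask_t;

/* Stand-in for sched_domains_numa_masks[level][cpu]. */
static const model_mask_t model_numa_masks[MODEL_NR_LEVELS][MODEL_NR_CPUS] = {
	{ 0x3, 0x3, 0xc, 0xc },	/* level 0: CPUs 0-1 and CPUs 2-3 are separate nodes */
	{ 0xf, 0xf, 0xf, 0xf },	/* level 1: every CPU spans the whole machine */
};

struct model_topology_level;

/* The callback takes the level object, not just the CPU number. */
typedef model_mask_t (*model_mask_fn)(const struct model_topology_level *tl,
				      int cpu);

struct model_topology_level {
	model_mask_fn mask;
	int numa_level;
};

/* Index by tl->numa_level instead of a global "current level" variable. */
static model_mask_t model_numa_mask(const struct model_topology_level *tl,
				    int cpu)
{
	assert(tl->numa_level >= 0 && tl->numa_level < MODEL_NR_LEVELS);
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_numa_masks[tl->numa_level][cpu];
}

static const struct model_topology_level model_levels[] = {
	{ model_numa_mask, 0 },	/* NODE-like level */
	{ model_numa_mask, 1 },	/* highest NUMA level */
};
```

In this model, calling model_levels[0].mask(&model_levels[0], cpu) returns the level-0 span no matter whether level 1 was queried just before, which is the property the global variable could not guarantee.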
Eh, of course I see this *after* looking at the v6 patch.
I tested this again for good measure, but given that I only test on
x86 and the changes since v6 are in s390/ppc, I didn't expect to see much
change :-)
Reviewed-by: Valentin Schneider <vschneid@...hat.com>
Tested-by: Valentin Schneider <vschneid@...hat.com>