linux-kernel - Re: [PATCH v4 1/2] sched/topology: improve topology_span


Message-ID: <b1ff9a6d-4593-4120-b989-5a0fdba8329a@amd.com>
Date: Tue, 17 Jun 2025 08:34:53 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Steve Wahl <steve.wahl@....com>, Leon Romanovsky <leon@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 linux-kernel@...r.kernel.org, Vishal Chourasia <vishalc@...ux.ibm.com>,
 samir <samir@...ux.ibm.com>, Naman Jain <namjain@...ux.microsoft.com>,
 Saurabh Singh Sengar <ssengar@...ux.microsoft.com>, srivatsa@...il.mit.edu,
 Michael Kelley <mhklinux@...look.com>, Russ Anderson <rja@....com>,
 Dimitri Sivanich <sivanich@....com>
Subject: Re: [PATCH v4 1/2] sched/topology: improve topology_span_sane speed

Hello Steve,

On 6/16/2025 7:48 PM, Steve Wahl wrote:
> On Sun, Jun 15, 2025 at 09:42:07AM +0300, Leon Romanovsky wrote:
>> On Thu, Jun 12, 2025 at 04:11:52PM +0530, K Prateek Nayak wrote:
>>> On 6/12/2025 3:00 PM, K Prateek Nayak wrote:
>>>> Ah! Since this happens so early topology isn't created yet for
>>>> the debug prints to hit! Is it possible to get a dmesg with
>>>> "ignore_loglevel" and "sched_verbose" on an older kernel that
>>>> did not throw this error on the same host?
>>
>> This is the dmesg with two commits reverted: "sched/topology:
>> Refinement to topology_span_sane speedup" and "sched/topology: improve
>> topology_span_sane speed"
> 
> I would be interested in whether there's a difference with only the
> second patch being reverted.  The first patch is expected to get the
> exact same results as previous code, only faster.  The second had
> simplifications suggested by others that could give different results
> under conditions that were not expected to exist.  The commit message
> for the second patch explains this.

Since NUMA domains are skipped as a result of SD_OVERLAP, the remaining
PKG domains don't show any discrepancy that would fail the current
check:

     CPU0 attaching sched-domain(s):
      domain-0: span=0-1 level=PKG               id:0    span:0-1
       groups: 0:{ span=0 }, 1:{ span=1 }
     CPU1 attaching sched-domain(s):
      domain-0: span=0-1 level=PKG               id:0    span:0-1
       groups: 1:{ span=1 }, 0:{ span=0 }
     CPU2 attaching sched-domain(s):
      domain-0: span=2-3 level=PKG               id:2    span:2-3
       groups: 2:{ span=2 }, 3:{ span=3 }
     CPU3 attaching sched-domain(s):
      domain-0: span=2-3 level=PKG               id:2    span:2-3
       groups: 3:{ span=3 }, 2:{ span=2 }
     CPU4 attaching sched-domain(s):
      domain-0: span=4-5 level=PKG               id:4    span:4-5
       groups: 4:{ span=4 }, 5:{ span=5 }
     CPU5 attaching sched-domain(s):
      domain-0: span=4-5 level=PKG               id:4    span:4-5
       groups: 5:{ span=5 }, 4:{ span=4 }
     CPU6 attaching sched-domain(s):
      domain-0: span=6-7 level=PKG               id:6    span:6-7
       groups: 6:{ span=6 }, 7:{ span=7 }
     CPU7 attaching sched-domain(s):
      domain-0: span=6-7 level=PKG               id:6    span:6-7
       groups: 7:{ span=7 }, 6:{ span=6 }
     CPU8 attaching sched-domain(s):
      domain-0: span=8-9 level=PKG               id:8    span:8-9
       groups: 8:{ span=8 }, 9:{ span=9 }
     CPU9 attaching sched-domain(s):
      domain-0: span=8-9 level=PKG               id:8    span:8-9
       groups: 9:{ span=9 }, 8:{ span=8 }
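
For reference, the property being checked can be sketched in a few lines.
This is a simplified, standalone illustration and not the kernel's
topology_span_sane(): it assumes at most 64 CPUs, one 64-bit mask per CPU
for a single topology level, and the names span_mask() and spans_sane()
are hypothetical. The rule it models is that the masks of any two CPUs at
one level must be either identical or completely disjoint.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bitmask with CPUs first..last set (last < 64). */
static uint64_t span_mask(int first, int last)
{
	uint64_t mask = 0;

	for (int cpu = first; cpu <= last; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}

/*
 * Simplified stand-in for the sanity rule, not the kernel implementation:
 * at one topology level, the masks of any two CPUs must be either
 * identical or completely disjoint.
 */
static bool spans_sane(const uint64_t *mask, int nr_cpus)
{
	for (int i = 0; i < nr_cpus; i++)
		for (int j = i + 1; j < nr_cpus; j++)
			if (mask[i] != mask[j] && (mask[i] & mask[j]))
				return false;
	return true;
}
```

With the PKG spans above (0-1, 2-3, 4-5, 6-7, 8-9), every pair of masks
is either equal or disjoint, so this sketch would report them as sane. A
CPU1 reporting span 1-2 instead would partially overlap CPU0's 0-1 and
fail.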

I suspect the failed check comes from a topology level that later gets
degenerated. Looking at the degeneration path, though, a degenerated
domain should either contain a single CPU (SMT, CLS, MC) or have the
same span as PKG (the NODE domain), and both cases should be sane.
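
The two span conditions mentioned above can be written down as a small
sketch. It only models the span side of degeneration and ignores the
flag and group checks the scheduler also performs; the helper names and
the 64-bit mask representation are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the span contains exactly one CPU,
 * i.e. exactly one bit is set. Such a domain has nothing to balance.
 */
static bool span_is_single_cpu(uint64_t span)
{
	return span != 0 && (span & (span - 1)) == 0;
}

/*
 * Hypothetical helper: true when the parent covers exactly the same
 * CPUs as the child, so (flags aside) the parent adds no new CPUs.
 */
static bool parent_span_is_redundant(uint64_t child, uint64_t parent)
{
	return child == parent;
}
```

In the dump above, a PKG span of 0-1 with a NODE span of 0-1 would fall
into the second case, while an MC span covering only CPU0 would fall
into the first.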

Leon, could you also paste the output of "numactl -H" from within the
guest, please? I'm wondering if the NUMA topology makes a difference
here somehow.

-- 
Thanks and Regards,
Prateek