Message-ID: <e12ec4f50c6c41db84f601038d3ee39c@hisilicon.com>
Date: Fri, 29 Jan 2021 02:02:58 +0000
From: "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To: Valentin Schneider <valentin.schneider@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: "mingo@...nel.org" <mingo@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"dietmar.eggemann@....com" <dietmar.eggemann@....com>,
"morten.rasmussen@....com" <morten.rasmussen@....com>,
"mgorman@...e.de" <mgorman@...e.de>
Subject: RE: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for
the deduplicating sort
> -----Original Message-----
> From: Valentin Schneider [mailto:valentin.schneider@....com]
> Sent: Friday, January 29, 2021 3:47 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>;
> linux-kernel@...r.kernel.org
> Cc: mingo@...nel.org; peterz@...radead.org; vincent.guittot@...aro.org;
> dietmar.eggemann@....com; morten.rasmussen@....com; mgorman@...e.de
> Subject: RE: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set
> for the deduplicating sort
>
> On 25/01/21 21:35, Song Bao Hua (Barry Song) wrote:
> > I was using 5.11-rc1. One thing I'd like to mention is that:
> >
> > For the below topology:
> > +-------+ +-----+
> > | node1 | 20 |node2|
> > | +----------+ |
> > +---+---+ +-----+
> > | |12
> > 12 | |
> > +---+---+ +---+-+
> > | | |node3|
> > | node0 | | |
> > +-------+ +-----+
> >
> > with node0-node2 as 22, node0-node3 as 24, node1-node3 as 22.
> >
> > I will get the below sched_domains_numa_distance[]:
> > 10, 12, 22, 24
> > As you can see there is *no* 20. So the node1 and node2 will
> > only get two-level numa sched_domain:
> >
>
>
> So that's
>
> -numa node,cpus=0-1,nodeid=0 -numa node,cpus=2-3,nodeid=1, \
> -numa node,cpus=4-5,nodeid=2, -numa node,cpus=6-7,nodeid=3, \
> -numa dist,src=0,dst=1,val=12, \
> -numa dist,src=0,dst=2,val=22, \
> -numa dist,src=0,dst=3,val=24, \
> -numa dist,src=1,dst=2,val=20, \
> -numa dist,src=1,dst=3,val=22, \
> -numa dist,src=2,dst=3,val=12
>
> but running this still doesn't get me a splat. Debugging
> sched_domains_numa_distance[] still gives me
> {10, 12, 20, 22, 24}
>
> >
> > But for the below topology:
> > +-------+ +-----+
> > | node0 | 20 |node2|
> > | +----------+ |
> > +---+---+ +-----+
> > | |12
> > 12 | |
> > +---+---+ +---+-+
> > | | |node3|
> > | node1 | | |
> > +-------+ +-----+
> >
> > with node1-node2 as 22, node1-node3 as 24,node0-node3 as 22.
> >
> > I will get the below sched_domains_numa_distance[]:
> > 10, 12, 20, 22, 24
> >
> > What I have seen is the performance will be better if we
> > drop the 20 as we will get a sched_domain hierarchy with less
> > levels, and two intermediate nodes won't have the group span
> > issue.
> >
>
> That is another thing that's worth considering. Morten was arguing that if
> the distance between two nodes is so tiny, it might not be worth
> representing it at all in the scheduler topology.
Yes, I agree it is a different thing. Anyway, I saw your patch has landed
in the sched tree. One side effect of your patch is that one more
sched_domain level is introduced for this topology:
   +------------------------ 24 ------------------------+
   |                                                    |
   |   +------------- 22 --------------+                |
   |   |                               |                |
+---------+      +---------+      +---------+      +---------+
|  node0  |  12  |  node1  |  20  |  node2  |  12  |  node3  |
+---------+      +----+----+      +---------+      +----+----+
                      |                                 |
                      +-------------- 22 ---------------+
Without the patch, Linux will use 10, 12, 22, 24 to build sched_domains;
with your patch, Linux will use 10, 12, 20, 22, 24 to build sched_domains.
So one more layer is added. What I have seen is that:

For node0, the <=12 sched_domain and the <=20 sched_domain span the same
range (node0, node1), so one of them is redundant. Then, in
cpu_attach_domain(), the redundant one is dropped due to "remove the
sched domains which do not contribute to scheduling".

For node1 and node2, the original code had no "20", thus it built one
fewer sched_domain level.

What is really interesting is that removing the 20 actually gives a
better speccpu benchmark result :-)
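For what it's worth, the reasoning above can be sketched outside the
kernel. This is illustrative Python, not kernel code; the distance matrix
is the one from the diagram above (0 -12- 1 -20- 2 -12- 3, plus
node0-node2 = 22, node1-node3 = 22, node0-node3 = 24):

```python
# Illustrative sketch, not kernel code: reproduce the distance-level
# reasoning for the linear topology above.
NODES = range(4)
DIST = [
    [10, 12, 22, 24],
    [12, 10, 20, 22],
    [22, 20, 10, 12],
    [24, 22, 12, 10],
]

# Old behaviour: only distances seen from node 0 were collected,
# so 20 (node1 <-> node2) was missed for this topology.
old_levels = sorted({DIST[0][j] for j in NODES})
print(old_levels)                # [10, 12, 22, 24]

# The set-based dedup over the whole matrix picks up 20 as well.
new_levels = sorted({DIST[i][j] for i in NODES for j in NODES})
print(new_levels)                # [10, 12, 20, 22, 24]

# Span of a node at a given distance level: all nodes within that distance.
def span(node, dist):
    return {j for j in NODES if DIST[node][j] <= dist}

# For node0 the <=12 and <=20 levels cover the same nodes, so the
# scheduler later drops one of the two resulting domains as redundant.
print(span(0, 12), span(0, 20))  # {0, 1} {0, 1}
```

For node1, by contrast, the <=20 span ({0, 1, 2}) really is wider than
the <=12 span ({0, 1}), which is where the extra level comes from.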
>
> > Thanks
> > Barry
Thanks
Barry