Message-ID: <73c0ed52b665468cb0aa0086f85da60c@hisilicon.com>
Date: Mon, 8 Feb 2021 10:27:27 +0000
From: "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To: Valentin Schneider <valentin.schneider@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"mgorman@...e.de" <mgorman@...e.de>,
"mingo@...nel.org" <mingo@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"dietmar.eggemann@....com" <dietmar.eggemann@....com>,
"morten.rasmussen@....com" <morten.rasmussen@....com>,
"linuxarm@...neuler.org" <linuxarm@...neuler.org>,
"xuwei (O)" <xuwei5@...wei.com>,
"Liguozhu (Kenneth)" <liguozhu@...ilicon.com>,
"tiantao (H)" <tiantao6@...ilicon.com>,
wanghuiqiang <wanghuiqiang@...wei.com>,
"Zengtao (B)" <prime.zeng@...ilicon.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
"guodong.xu@...aro.org" <guodong.xu@...aro.org>,
Meelis Roos <mroos@...ux.ee>
Subject: RE: [RFC PATCH 2/2] Revert "sched/topology: Warn when NUMA diameter > 2"
> -----Original Message-----
> From: Valentin Schneider [mailto:valentin.schneider@....com]
> Sent: Thursday, February 4, 2021 4:55 AM
> To: linux-kernel@...r.kernel.org
> Cc: vincent.guittot@...aro.org; mgorman@...e.de; mingo@...nel.org;
> peterz@...radead.org; dietmar.eggemann@....com; morten.rasmussen@....com;
> linuxarm@...neuler.org; xuwei (O) <xuwei5@...wei.com>; Liguozhu (Kenneth)
> <liguozhu@...ilicon.com>; tiantao (H) <tiantao6@...ilicon.com>; wanghuiqiang
> <wanghuiqiang@...wei.com>; Zengtao (B) <prime.zeng@...ilicon.com>; Jonathan
> Cameron <jonathan.cameron@...wei.com>; guodong.xu@...aro.org; Song Bao Hua
> (Barry Song) <song.bao.hua@...ilicon.com>; Meelis Roos <mroos@...ux.ee>
> Subject: [RFC PATCH 2/2] Revert "sched/topology: Warn when NUMA diameter > 2"
>
> The scheduler topology code can now figure out what to do with such
> topologies.
>
> This reverts commit b5b217346de85ed1b03fdecd5c5076b34fbb2f0b.
>
> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
Yes, this is fine. I have actually seen some other problems we need
to consider.

The current code is probably well tuned for machines with
2 hops or fewer. So even after we fix the 3-hops span issue, I
can still see some other issues.

For example, if we change the sd flags in sd_init() and clear the
SD_BALANCE_EXEC, SD_BALANCE_FORK and SD_WAKE_AFFINE flags for the
last hop, we see a large score increase in unixbench:

	if (sched_domains_numa_distance[tl->numa_level] > node_reclaim_distance ||
	    is_3rd_hops_domain(...)) {
		sd->flags &= ~(SD_BALANCE_EXEC |
			       SD_BALANCE_FORK |
			       SD_WAKE_AFFINE);
	}

So I guess something needs to be tuned for machines with 3 hops or more.
But we need a kernel that has the fix for the 3-hops issue before we can
do more work.
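
To make the experiment above concrete, here is a minimal user-space sketch
of the same flag masking. It is not kernel code: the DEMO_* bit values, the
demo_is_far_domain() helper (a hypothetical stand-in for the elided
is_3rd_hops_domain(...) check) and the thresholds passed in demo_check()
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel defines the real SD_* flags. */
enum {
	DEMO_SD_BALANCE_EXEC = 1 << 0,
	DEMO_SD_BALANCE_FORK = 1 << 1,
	DEMO_SD_WAKE_AFFINE  = 1 << 2,
	DEMO_SD_NUMA         = 1 << 3,
};

/*
 * Hypothetical stand-in for is_3rd_hops_domain(...): treat every NUMA
 * level at or above far_level as "too far".
 */
static bool demo_is_far_domain(int numa_level, int far_level)
{
	return numa_level >= far_level;
}

/*
 * Clear the fork/exec balancing and wake-affine bits when the domain is
 * beyond the reclaim distance or sits at a far hop; leave all other bits.
 */
static unsigned int demo_trim_flags(unsigned int flags, int distance,
				    int reclaim_distance, int numa_level,
				    int far_level)
{
	if (distance > reclaim_distance ||
	    demo_is_far_domain(numa_level, far_level)) {
		flags &= ~(unsigned int)(DEMO_SD_BALANCE_EXEC |
					 DEMO_SD_BALANCE_FORK |
					 DEMO_SD_WAKE_AFFINE);
	}
	return flags;
}

/* Example usage with made-up distances and an example threshold of 30. */
void demo_check(void)
{
	unsigned int all = DEMO_SD_BALANCE_EXEC | DEMO_SD_BALANCE_FORK |
			   DEMO_SD_WAKE_AFFINE | DEMO_SD_NUMA;

	/* Near domain at a low level: flags are untouched. */
	assert(demo_trim_flags(all, 20, 30, 1, 3) == all);
	/* Third-hop domain: only the unrelated bit survives. */
	assert(demo_trim_flags(all, 20, 30, 3, 3) == DEMO_SD_NUMA);
}
```

The only point of the sketch is that the two conditions are OR-ed, so a
far-hop domain loses the three flags even when its distance is within the
reclaim threshold.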
> ---
> kernel/sched/topology.c | 33 ---------------------------------
> 1 file changed, 33 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index a8f69f234258..0fa41aab74e0 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -688,7 +688,6 @@ cpu_attach_domain(struct sched_domain *sd, struct
> root_domain *rd, int cpu)
> {
> struct rq *rq = cpu_rq(cpu);
> struct sched_domain *tmp;
> - int numa_distance = 0;
>
> /* Remove the sched domains which do not contribute to scheduling. */
> for (tmp = sd; tmp; ) {
> @@ -720,38 +719,6 @@ cpu_attach_domain(struct sched_domain *sd, struct
> root_domain *rd, int cpu)
> sd->child = NULL;
> }
>
> - for (tmp = sd; tmp; tmp = tmp->parent)
> - numa_distance += !!(tmp->flags & SD_NUMA);
> -
> - /*
> - * FIXME: Diameter >=3 is misrepresented.
> - *
> - * Smallest diameter=3 topology is:
> - *
> - * node 0 1 2 3
> - * 0: 10 20 30 40
> - * 1: 20 10 20 30
> - * 2: 30 20 10 20
> - * 3: 40 30 20 10
> - *
> - * 0 --- 1 --- 2 --- 3
> - *
> - * NUMA-3 0-3 N/A N/A 0-3
> - * groups: {0-2},{1-3} {1-3},{0-2}
> - *
> - * NUMA-2 0-2 0-3 0-3 1-3
> - * groups: {0-1},{1-3} {0-2},{2-3} {1-3},{0-1} {2-3},{0-2}
> - *
> - * NUMA-1 0-1 0-2 1-3 2-3
> - * groups: {0},{1} {1},{2},{0} {2},{3},{1} {3},{2}
> - *
> - * NUMA-0 0 1 2 3
> - *
> - * The NUMA-2 groups for nodes 0 and 3 are obviously buggered, as the
> - * group span isn't a subset of the domain span.
> - */
> - WARN_ONCE(numa_distance > 2, "Shortest NUMA path spans too many nodes\n");
> -
> sched_domain_debug(sd, cpu);
>
> rq_attach_root(rq, rd);
> --
> 2.27.0
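
For reference, the 4-node line topology from the removed comment can be
checked with a small user-space sketch. It counts the distinct remote
distances in that table, which is only a rough analogue of what the removed
loop did (that loop counted SD_NUMA domains in one CPU's hierarchy). All
demo_* names are made up for this example.

```c
#include <assert.h>

#define DEMO_NODES 4

/* Distance table copied from the removed comment (0 --- 1 --- 2 --- 3). */
static const int demo_distance[DEMO_NODES][DEMO_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Count how many distinct node-to-node distances exist, ignoring i == j. */
static int demo_remote_levels(const int dist[DEMO_NODES][DEMO_NODES])
{
	int seen[DEMO_NODES * DEMO_NODES];
	int nr = 0;

	for (int i = 0; i < DEMO_NODES; i++) {
		for (int j = 0; j < DEMO_NODES; j++) {
			int known = 0;

			if (i == j)
				continue;

			/* Linear scan is fine for a table this small. */
			for (int k = 0; k < nr; k++) {
				if (seen[k] == dist[i][j]) {
					known = 1;
					break;
				}
			}
			if (!known) {
				assert(nr < DEMO_NODES * DEMO_NODES);
				seen[nr++] = dist[i][j];
			}
		}
	}
	return nr;
}
```

For this table the remote distances are 20, 30 and 40, so the function
returns 3, which matches the three NUMA levels (NUMA-1 to NUMA-3) that the
comment lists for nodes 0 and 3 and exceeds the limit of 2 that the removed
WARN_ONCE() checked.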
Thanks
Barry