Date:   Wed, 10 Feb 2021 12:27:22 +0000
From:   "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     "valentin.schneider@....com" <valentin.schneider@....com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "mgorman@...e.de" <mgorman@...e.de>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "morten.rasmussen@....com" <morten.rasmussen@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxarm@...neuler.org" <linuxarm@...neuler.org>,
        "xuwei (O)" <xuwei5@...wei.com>,
        "Liguozhu (Kenneth)" <liguozhu@...ilicon.com>,
        "tiantao (H)" <tiantao6@...ilicon.com>,
        wanghuiqiang <wanghuiqiang@...wei.com>,
        "Zengtao (B)" <prime.zeng@...ilicon.com>,
        Jonathan Cameron <jonathan.cameron@...wei.com>,
        "guodong.xu@...aro.org" <guodong.xu@...aro.org>,
        Meelis Roos <mroos@...ux.ee>
Subject: RE: [PATCH v2] sched/topology: fix the issue groups don't span
 domain->span for NUMA diameter > 2



> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@...radead.org]
> Sent: Thursday, February 11, 2021 12:22 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>
> Cc: valentin.schneider@....com; vincent.guittot@...aro.org; mgorman@...e.de;
> mingo@...nel.org; dietmar.eggemann@....com; morten.rasmussen@....com;
> linux-kernel@...r.kernel.org; linuxarm@...neuler.org; xuwei (O)
> <xuwei5@...wei.com>; Liguozhu (Kenneth) <liguozhu@...ilicon.com>; tiantao (H)
> <tiantao6@...ilicon.com>; wanghuiqiang <wanghuiqiang@...wei.com>; Zengtao (B)
> <prime.zeng@...ilicon.com>; Jonathan Cameron <jonathan.cameron@...wei.com>;
> guodong.xu@...aro.org; Meelis Roos <mroos@...ux.ee>
> Subject: Re: [PATCH v2] sched/topology: fix the issue groups don't span
> domain->span for NUMA diameter > 2
> 
> On Tue, Feb 09, 2021 at 08:58:15PM +0000, Song Bao Hua (Barry Song) wrote:
> 
> > > I've finally had a moment to think about this, would it make sense to
> > > also break up group: node0+1, such that we then end up with 3 groups of
> > > equal size?
> >
> 
> > Since the sched_domain[n-1]s of a subset of node[m]'s siblings are
> > able to cover the whole span of node[m]'s sched_domain[n], there is
> > no need to scan over all of node[m]'s siblings; once sched_domain[n]
> > of node[m] has been covered, we can stop making more sched_groups.
> > So the number of sched_groups stays small.
> >
> > So historically, the code has never tried to make the sched_groups
> > equal in size, and it permits the local group to overlap with
> > remote groups.
> 
> Historically groups have (typically) always been the same size though.

This is probably true for other platforms, but unfortunately it has
never been true on my platform :-)

node   0   1   2   3 
  0:  10  12  20  22 
  1:  12  10  22  24 
  2:  20  22  10  12 
  3:  22  24  12  10

Here, each NUMA node has only two CPUs.

CPU0's domain-3 has no sched_group overflowing the domain's span, but
its groups are unequal: the first group covers CPUs 0-5 (node0-node2)
while the second covers CPUs 4-7 (node2-node3):

[    0.802139] CPU0 attaching sched-domain(s):
[    0.802193]  domain-0: span=0-1 level=MC
[    0.802443]   groups: 0:{ span=0 cap=1013 }, 1:{ span=1 cap=979 }
[    0.802693]   domain-1: span=0-3 level=NUMA
[    0.802731]    groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 }
[    0.802811]    domain-2: span=0-5 level=NUMA
[    0.802829]     groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 }
[    0.802881] ERROR: groups don't span domain->span
[    0.803058]     domain-3: span=0-7 level=NUMA
[    0.803080]      groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 }

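For context, this comes from the way the overlapping groups are built.
Below is a much-simplified sketch of build_overlap_sched_groups() in
kernel/sched/topology.c (balance masks, capacity setup and error
handling elided): each group takes its span from a sibling CPU's child
domain, and CPUs that are already covered are skipped, so nothing in
this loop forces the groups to be of equal size.

static int
build_overlap_sched_groups(struct sched_domain *sd, int cpu)
{
	struct sched_group *first = NULL, *last = NULL, *sg;
	const struct cpumask *span = sched_domain_span(sd);
	struct cpumask *covered = sched_domains_tmpmask;
	struct sd_data *sdd = sd->private;
	struct sched_domain *sibling;
	int i;

	cpumask_clear(covered);

	/* walk the domain's span, starting from 'cpu' itself */
	for_each_cpu_wrap(i, span, cpu) {
		/* CPUs already covered by an earlier group are skipped */
		if (cpumask_test_cpu(i, covered))
			continue;

		sibling = *per_cpu_ptr(sdd->sd, i);

		/*
		 * The new group spans the sibling's *child* domain,
		 * whatever size that happens to be.
		 */
		sg = build_group_from_child_sched_domain(sibling, cpu);
		if (!sg)
			return -ENOMEM;

		cpumask_or(covered, covered, sched_group_span(sg));

		/* link the group into the domain's circular group list */
		if (!first)
			first = sg;
		if (last)
			last->next = sg;
		last = sg;
		last->next = first;
	}
	sd->groups = first;

	return 0;
}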

> 
> The reason I asked is that when you get one large group and a bunch of
> smaller ones, the load-balancing 'pull' is relatively smaller for the
> large group.
> 
> That is, IIRC should_we_balance() ensures only 1 CPU out of the group
> continues the load-balancing pass. So if, for example, we have one group
> of 4 CPUs and one group of 2 CPUs, then the group of 2 CPUs will pull
> 1/2 times, while the group of 4 CPUs will pull 1/4 times.
> 
> By making sure all groups are of the same level, and thus of equal size,
> this doesn't happen.

As you can see, even if we give all the groups of domain-2 equal size
by breaking up both the local group and the remote groups, we will run
into the same problem in domain-3. What makes it trickier is that
domain-3 itself never reports "groups don't span domain->span".

So it seems we would need to change both domain-2 and domain-3, even
though domain-3 shows no error.
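
To put the 4-CPU vs 2-CPU example above into numbers, here is a toy
userspace model (not kernel code; it simply assumes that each period
one CPU per group, picked at random, passes should_we_balance(), which
roughly approximates the first-idle-CPU selection):

#include <stdio.h>
#include <stdlib.h>

#define PERIODS 1000000

/*
 * Fraction of periods in which one given CPU gets to run the
 * load-balancing pass on behalf of its group.
 */
static double pull_ratio(int group_size)
{
	int wins = 0;

	for (int p = 0; p < PERIODS; p++)
		if (rand() % group_size == 0)	/* this CPU was picked */
			wins++;

	return (double)wins / PERIODS;
}

int main(void)
{
	srand(1);
	printf("CPU in 2-CPU group pulls ~%.2f of the time\n", pull_ratio(2));
	printf("CPU in 4-CPU group pulls ~%.2f of the time\n", pull_ratio(4));
	return 0;
}

This prints roughly 0.50 and 0.25, i.e. a CPU in the smaller group
pulls about twice as often, which is exactly the asymmetry that making
all groups the same size avoids.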

Thanks
Barry
