Date:   Wed, 10 Feb 2021 09:35:00 +0000
From:   "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To:     Meelis Roos <mroos@...ux.ee>,
        "valentin.schneider@....com" <valentin.schneider@....com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "mgorman@...e.de" <mgorman@...e.de>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "morten.rasmussen@....com" <morten.rasmussen@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     "linuxarm@...neuler.org" <linuxarm@...neuler.org>,
        "xuwei (O)" <xuwei5@...wei.com>,
        "Liguozhu (Kenneth)" <liguozhu@...ilicon.com>,
        "tiantao (H)" <tiantao6@...ilicon.com>,
        wanghuiqiang <wanghuiqiang@...wei.com>,
        "Zengtao (B)" <prime.zeng@...ilicon.com>,
        Jonathan Cameron <jonathan.cameron@...wei.com>,
        "guodong.xu@...aro.org" <guodong.xu@...aro.org>
Subject: RE: [PATCH v3] sched/topology: fix the issue groups don't span
 domain->span for NUMA diameter > 2



> -----Original Message-----
> From: Meelis Roos [mailto:mroos@...ux.ee]
> Sent: Wednesday, February 10, 2021 1:40 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>;
> valentin.schneider@....com; vincent.guittot@...aro.org; mgorman@...e.de;
> mingo@...nel.org; peterz@...radead.org; dietmar.eggemann@....com;
> morten.rasmussen@....com; linux-kernel@...r.kernel.org
> Cc: linuxarm@...neuler.org; xuwei (O) <xuwei5@...wei.com>; Liguozhu (Kenneth)
> <liguozhu@...ilicon.com>; tiantao (H) <tiantao6@...ilicon.com>; wanghuiqiang
> <wanghuiqiang@...wei.com>; Zengtao (B) <prime.zeng@...ilicon.com>; Jonathan
> Cameron <jonathan.cameron@...wei.com>; guodong.xu@...aro.org
> Subject: Re: [PATCH v3] sched/topology: fix the issue groups don't span
> domain->span for NUMA diameter > 2
> 
> I did a rudimentary benchmark on the same 8-node Sun Fire X4600-M2, on top of
> today's 5.11.0-rc7-00002-ge0756cfc7d7c.
> 
> The test: building clean kernel with make -j64 after make clean and drop_caches.
> 
> While running the clean kernel (3 tries):
> 
> real    2m38.574s
> user    46m18.387s
> sys     6m8.724s
> 
> real    2m37.647s
> user    46m34.171s
> sys     6m11.993s
> 
> real    2m37.832s
> user    46m34.910s
> sys     6m12.013s
> 
> 
> While running patched kernel:
> 
> real    2m40.072s
> user    46m22.610s
> sys     6m6.658s
> 
> 
> For real time, it seems to be 1.5s-2s slower out of 160s (noise?). User and system
> time are slightly lower, on the other hand, so it seems good to me.

I ran the same test on a machine with the topology below:
numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0-31
node 0 size: 64144 MB
node 0 free: 62356 MB
node 1 cpus: 32-63
node 1 size: 64509 MB
node 1 free: 62996 MB
node 2 cpus: 64-95
node 2 size: 64509 MB
node 2 free: 63020 MB
node 3 cpus: 96-127
node 3 size: 63991 MB
node 3 free: 62647 MB
node distances:
node   0   1   2   3 
  0:  10  12  20  22 
  1:  12  10  22  24 
  2:  20  22  10  12 
  3:  22  24  12  10
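
To make the shape of this topology easier to see, the distance table above can be reduced to its distinct distance values and its most distant node pair. This is only a userspace sketch in Python, not kernel code; the helper names `distinct_distances` and `farthest_pairs` are made up for illustration.

```python
# Distance matrix copied from the `numactl --hardware` output above.
DISTANCES = [
    [10, 12, 20, 22],
    [12, 10, 22, 24],
    [20, 22, 10, 12],
    [22, 24, 12, 10],
]


def distinct_distances(matrix):
    """Return the sorted distinct node distances found in the matrix."""
    return sorted({d for row in matrix for d in row})


def farthest_pairs(matrix):
    """Return (max distance, [(a, b), ...]) for node pairs a < b at that distance."""
    n = len(matrix)
    far = max(max(row) for row in matrix)
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)
             if matrix[a][b] == far]
    return far, pairs


print(distinct_distances(DISTANCES))
print(farthest_pairs(DISTANCES))
```

On this table it yields five distinct values (10, 12, 20, 22, 24), and nodes 1 and 3 are the only pair at the maximum distance of 24.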

Basically, the influence of the patch on the kernel build is within noise.
I ran the commands below for several rounds:

make clean
echo 3 > /proc/sys/vm/drop_caches
make Image -j100

w/ patch:               w/o patch:

real    1m17.644s       real    1m19.510s
user    32m12.074s      user    32m14.133s
sys     4m35.827s       sys     4m38.198s

real    1m15.855s       real    1m17.303s
user    32m7.700s       user    32m14.128s
sys     4m35.868s       sys     4m40.094s

real    1m18.918s       real    1m19.583s
user    32m13.352s      user    32m13.205s
sys     4m40.161s       sys     4m40.696s

real    1m20.329s       real    1m17.819s
user    32m7.255s       user    32m11.753s
sys     4m36.706s       sys     4m41.371s

real    1m17.773s       real    1m16.763s
user    32m19.912s      user    32m15.607s
sys     4m36.989s       sys     4m41.297s

real    1m14.943s       real    1m18.551s
user    32m14.549s      user    32m18.521s
sys     4m38.670s       sys     4m41.392s

real    1m16.439s       real    1m18.154s
user    32m12.864s      user    32m14.540s
sys     4m39.424s       sys     4m40.364s
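
One rough way to check the "noise" reading is to compare the mean and the run-to-run spread of the "real" times from the seven rounds above. The Python sketch below does only that; it is not a proper significance test, and the helper names `to_seconds` and `summarize` are made up for illustration.

```python
import re
from statistics import mean, stdev

# "real" times copied from the seven rounds above.
REAL_WITH = ["1m17.644s", "1m15.855s", "1m18.918s", "1m20.329s",
             "1m17.773s", "1m14.943s", "1m16.439s"]
REAL_WITHOUT = ["1m19.510s", "1m17.303s", "1m19.583s", "1m17.819s",
                "1m16.763s", "1m18.551s", "1m18.154s"]


def to_seconds(t):
    """Convert a `time`-style string such as '1m17.644s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def summarize(samples):
    """Return (mean, sample standard deviation) in seconds."""
    secs = [to_seconds(s) for s in samples]
    return mean(secs), stdev(secs)


for label, samples in (("w/ patch", REAL_WITH), ("w/o patch", REAL_WITHOUT)):
    avg, sd = summarize(samples)
    print(f"{label}: mean {avg:.2f}s, stdev {sd:.2f}s")
```

The means come out at roughly 77.4s with the patch and 78.2s without it, a gap of under one second, which is smaller than the spread between individual runs.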

Colleagues on our team ran unixbench with the 3-hops-fix patch and
reported the following scores (3 rounds):

w/o patch:    w/ patch:
1228.6        1254.9
1231.4        1265.7
1226.1        1266.1
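
For a sense of scale, the relative change between the two columns above can be computed from the mean scores. This is a small Python sketch over the three rounds only, so it says nothing about variance; `relative_gain` is a made-up helper name.

```python
from statistics import mean

# unixbench scores copied from the table above.
WITHOUT_PATCH = [1228.6, 1231.4, 1226.1]
WITH_PATCH = [1254.9, 1265.7, 1266.1]


def relative_gain(baseline, patched):
    """Percentage change of the mean patched score over the mean baseline."""
    base = mean(baseline)
    return (mean(patched) - base) / base * 100.0


print(f"{relative_gain(WITHOUT_PATCH, WITH_PATCH):.1f}%")
```

That is an increase of about 2.7% in the mean score with the patch.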

One interesting observation: if we change the kernel to
disallow the balancing flags below for the last hop,
			sd->flags &= ~(SD_BALANCE_EXEC |
				       SD_BALANCE_FORK |
				       SD_WAKE_AFFINE);

we see a further increase in the unixbench score. So it sounds like
this kind of balancing shouldn't reach that far. But that is a different
topic.
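
For readers less used to the `&= ~(...)` idiom in the fragment above, here is the same bit-clearing operation as a Python sketch. The numeric flag values are hypothetical and chosen only for illustration (the real SD_* values are defined by the kernel), and `strip_balancing_flags` is a made-up helper name.

```python
# Hypothetical bit values, for illustration only; the real SD_* values
# are defined by the kernel and may differ between versions.
SD_BALANCE_EXEC = 1 << 1
SD_BALANCE_FORK = 1 << 2
SD_WAKE_AFFINE = 1 << 3
SD_NUMA = 1 << 4  # stand-in for any flag that must be left untouched


def strip_balancing_flags(flags):
    """Clear the three balancing flags and keep every other bit as-is."""
    return flags & ~(SD_BALANCE_EXEC | SD_BALANCE_FORK | SD_WAKE_AFFINE)


flags = SD_BALANCE_EXEC | SD_BALANCE_FORK | SD_WAKE_AFFINE | SD_NUMA
print(strip_balancing_flags(flags) == SD_NUMA)
```

The mask removes only the listed bits, so any other flags set on the domain are preserved.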

> 
> --
> Meelis Roos <mroos@...ux.ee>

Thanks
Barry
