Message-ID: <4977e67c975723c98a441e08cc9f001b69f5668e.camel@intel.com>
Date: Fri, 20 May 2022 14:44:26 +0800
From: Ying Huang <ying.huang@...el.com>
To: Mel Gorman <mgorman@...hsingularity.net>,
kernel test robot <oliver.sang@...el.com>
Cc: 0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
lkp@...ts.01.org, fengwei.yin@...el.com,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Valentin Schneider <valentin.schneider@....com>,
Aubrey Li <aubrey.li@...ux.intel.com>, yu.c.chen@...el.com
Subject: Re: [LKP] Re: [sched/numa] bb2dee337b: unixbench.score -11.2% regression
On Thu, 2022-05-19 at 15:54 +0800, ying.huang@...el.com wrote:
> Hi, Mel,
>
> On Wed, 2022-05-18 at 16:22 +0100, Mel Gorman wrote:
> > On Wed, May 18, 2022 at 05:24:14PM +0800, kernel test robot wrote:
> > >
> > >
> > > Greeting,
> > >
> > > FYI, we noticed a -11.2% regression of unixbench.score due to commit:
> > >
> > >
> > > commit: bb2dee337bd7d314eb7c7627e1afd754f86566bc ("[PATCH 3/4] sched/numa: Apply imbalance limitations consistently")
> > > url: https://github.com/intel-lab-lkp/linux/commits/Mel-Gorman/Mitigate-inconsistent-NUMA-imbalance-behaviour/20220511-223233
> > > base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git d70522fc541224b8351ac26f4765f2c6268f8d72
> > > patch link: https://lore.kernel.org/lkml/20220511143038.4620-4-mgorman@techsingularity.net
> > >
> > > in testcase: unixbench
> > > on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz with 256G memory
> > > with following parameters:
> > >
> > > runtime: 300s
> > > nr_task: 1
> > > test: shell8
> > > cpufreq_governor: performance
> > > ucode: 0xd000331
> > >
> > > test-description: UnixBench is the original BYTE UNIX benchmark suite aims to test performance of Unix-like system.
> > > test-url: https://github.com/kdlucas/byte-unixbench
> >
> > I think what is happening for unixbench is that it prefers to run all
> > instances on a local node if possible. shell8 is creating 8 scripts,
> > each of which spawn more processes. The total number of tasks may exceed
> > the allowed imbalance at fork time of 16 tasks. Some spill over to a
> > remote node and as they are using files, some accesses are remote and it
> > suffers. It's not memory bandwidth bound but is sensitive to locality.
> > The stats somewhat support this idea
> >
> > > 83590 ± 13% -73.7% 21988 ± 32% numa-meminfo.node0.AnonHugePages
> > > 225657 ± 18% -58.0% 94847 ± 18% numa-meminfo.node0.AnonPages
> > > 231652 ± 17% -55.3% 103657 ± 16% numa-meminfo.node0.AnonPages.max
> > > 234525 ± 17% -55.5% 104341 ± 18% numa-meminfo.node0.Inactive
> > > 234397 ± 17% -55.5% 104267 ± 18% numa-meminfo.node0.Inactive(anon)
> > > 11724 ± 7% +17.5% 13781 ± 5% numa-meminfo.node0.KernelStack
> > > 4472 ± 34% +117.1% 9708 ± 31% numa-meminfo.node0.PageTables
> > > 15239 ± 75% +401.2% 76387 ± 10% numa-meminfo.node1.AnonHugePages
> > > 67256 ± 63% +206.3% 205994 ± 6% numa-meminfo.node1.AnonPages
> > > 73568 ± 58% +193.1% 215644 ± 6% numa-meminfo.node1.AnonPages.max
> > > 75737 ± 53% +183.9% 215053 ± 6% numa-meminfo.node1.Inactive
> > > 75709 ± 53% +183.9% 214971 ± 6% numa-meminfo.node1.Inactive(anon)
> > > 3559 ± 42% +187.1% 10216 ± 8% numa-meminfo.node1.PageTables
> >
> > There is less memory used on one node and more on the other so it's
> > getting split.
>
> This makes sense. I will also check CPU utilization per node to verify
> this directly.
I ran this workload 3 times for the commit and for its parent, collecting
mpstat per-node statistics.

For the parent commit,
"mpstat.node.0.usr%": [
0.1396875,
3.0806153846153848,
0.05303030303030303
],
"mpstat.node.0.sys%": [
0.10515625,
5.597692307692308,
0.1340909090909091
],
"mpstat.node.1.usr%": [
3.1015625,
0.1306153846153846,
3.0275757575757574
],
"mpstat.node.1.sys%": [
5.66703125,
0.11676923076923076,
5.498181818181818
],
The difference between the two nodes is quite large: in each run, one
node does almost all of the work and the other is nearly idle.

For the commit (bb2dee337b),
"mpstat.node.0.usr%": [
1.42109375,
1.4725,
1.5140625
],
"mpstat.node.0.sys%": [
3.00125,
3.16390625,
3.1284375
],
"mpstat.node.1.usr%": [
1.4909375,
1.41609375,
1.3740625
],
"mpstat.node.1.sys%": [
3.1671875,
3.00109375,
3.044375
],
The difference between the two nodes is greatly reduced: with the commit,
the load is spread almost evenly across both nodes. So this confirms your
theory directly.
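
To put a number on the imbalance, here is a rough sketch that sums usr% and
sys% per node for each run and prints the absolute gap between the two
nodes. The values are copied from the mpstat output above; the helper names
(busy, node_gap) are only for illustration and are not part of the LKP
tooling.

```python
# Per-node mpstat percentages from this mail, one list entry per run.
parent = {
    "node0": {"usr": [0.1396875, 3.0806153846153848, 0.05303030303030303],
              "sys": [0.10515625, 5.597692307692308, 0.1340909090909091]},
    "node1": {"usr": [3.1015625, 0.1306153846153846, 3.0275757575757574],
              "sys": [5.66703125, 0.11676923076923076, 5.498181818181818]},
}
commit = {
    "node0": {"usr": [1.42109375, 1.4725, 1.5140625],
              "sys": [3.00125, 3.16390625, 3.1284375]},
    "node1": {"usr": [1.4909375, 1.41609375, 1.3740625],
              "sys": [3.1671875, 3.00109375, 3.044375]},
}


def busy(node):
    """Return usr% + sys% for each run of one node."""
    return [u + s for u, s in zip(node["usr"], node["sys"])]


def node_gap(stats):
    """Return |node0 busy% - node1 busy%| for each run."""
    return [abs(a - b)
            for a, b in zip(busy(stats["node0"]), busy(stats["node1"]))]


parent_gap = node_gap(parent)
commit_gap = node_gap(commit)

for name, gap in (("parent", parent_gap), ("commit", commit_gap)):
    print(name, [round(g, 2) for g in gap])
```

With these numbers the gap is more than 8 percentage points in every parent
run and below 0.3 in every run with the commit, while the total busy time
over both nodes stays at roughly 9% in both cases.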
Best Regards,
Huang, Ying
[snip]