Message-ID: <ZWBIUa92yQFaQ/kM@xsang-OptiPlex-9020>
Date: Fri, 24 Nov 2023 14:53:05 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: "Huang, Ying" <ying.huang@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Vlastimil Babka <vbabka@...e.cz>,
"David Hildenbrand" <david@...hat.com>,
Johannes Weiner <jweiner@...hat.com>,
"Dave Hansen" <dave.hansen@...ux.intel.com>,
Michal Hocko <mhocko@...e.com>,
"Pavel Tatashin" <pasha.tatashin@...een.com>,
Matthew Wilcox <willy@...radead.org>,
Christoph Lameter <cl@...ux.com>,
Arjan van de Ven <arjan@...ux.intel.com>,
Sudeep Holla <sudeep.holla@....com>, <linux-mm@...ck.org>,
<feng.tang@...el.com>, <fengwei.yin@...el.com>,
<oliver.sang@...el.com>
Subject: Re: [linus:master] [mm, pcp] 6ccdcb6d3a: stress-ng.judy.ops_per_sec
-4.7% regression
Hi, Huang Ying,
On Thu, Nov 23, 2023 at 01:40:02PM +0800, Huang, Ying wrote:
> Hi,
>
> Thanks for test!
>
> kernel test robot <oliver.sang@...el.com> writes:
>
> > Hello,
> >
> > kernel test robot noticed a -4.7% regression of stress-ng.judy.ops_per_sec on:
> >
> >
> > commit: 6ccdcb6d3a741c4e005ca6ffd4a62ddf8b5bead3 ("mm, pcp: reduce detecting time of consecutive high order page freeing")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > testcase: stress-ng
> > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> > parameters:
> >
> > nr_threads: 100%
> > testtime: 60s
> > class: cpu-cache
> > test: judy
> > disk: 1SSD
> > cpufreq_governor: performance
> >
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | lmbench3: lmbench3.TCP.socket.bandwidth.10MB.MB/sec 23.7% improvement |
> > | test machine | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory |
> > | test parameters | cpufreq_governor=performance |
> > | | mode=development |
> > | | nr_threads=100% |
> > | | test=TCP |
> > | | test_memory_size=50% |
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | stress-ng: stress-ng.file-ioctl.ops_per_sec -6.6% regression |
> > | test machine | 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory |
> > | test parameters | class=filesystem |
> > | | cpufreq_governor=performance |
> > | | disk=1SSD |
> > | | fs=btrfs |
> > | | nr_threads=10% |
> > | | test=file-ioctl |
> > | | testtime=60s |
> > +------------------+-------------------------------------------------------------------------------------------------+
>
> It's expected that this commit will benefit some workloads (mainly
> network and inter-process communication related) and hurt others.
> But the whole series should not cause much regression. Can you try the
> whole series for the regression test cases? The series starts from
> commit ca71fe1ad922 ("mm, pcp: avoid to drain PCP when process exit") and
> ends at commit 6ccdcb6d3a74 ("mm, pcp: reduce detecting time of
> consecutive high order page freeing").
Here is the series (newest first), with the commit just before it on the
last line:
* 6ccdcb6d3a741 mm, pcp: reduce detecting time of consecutive high order page freeing
* 57c0419c5f0ea mm, pcp: decrease PCP high if free pages < high watermark
* 51a755c56dc05 mm: tune PCP high automatically
* 90b41691b9881 mm: add framework for PCP high auto-tuning
* c0a242394cb98 mm, page_alloc: scale the number of pages that are batch allocated
* 52166607ecc98 mm: restrict the pcp batch scale factor to avoid too long latency
* 362d37a106dd3 mm, pcp: reduce lock contention for draining high-order pages
* 94a3bfe4073cd cacheinfo: calculate size of per-CPU data cache slice
* ca71fe1ad9221 mm, pcp: avoid to drain PCP when process exit
* 1f4f7f0f8845d mm/oom_killer: simplify OOM killer info dump helper
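I tested 1f4f7f0f8845d (the commit just before the series) against
6ccdcb6d3a741 (the last commit in the series).

For stress-ng.judy.ops_per_sec, the whole series shows a smaller regression
(-2.0%, versus -4.7% for the single commit). The full comparison is attached
as ncompare-judy: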
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  cpu-cache/gcc-12/performance/1SSD/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/judy/stress-ng/60s

1f4f7f0f8845dbac 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
   6925490            -0.9%    6862477        stress-ng.judy.Judy_delete_operations_per_sec
  22515488            -0.4%   22420191        stress-ng.judy.Judy_find_operations_per_sec
   9036524            -3.9%    8685310 ±  3%  stress-ng.judy.Judy_insert_operations_per_sec
    171299            -2.0%     167905        stress-ng.judy.ops
      2853            -2.0%       2796        stress-ng.judy.ops_per_sec

For stress-ng.file-ioctl.ops_per_sec, the regression is similar (-6.9%,
versus -6.6% for the single commit). The full comparison is attached as
ncompare-file-ioctl:
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  filesystem/gcc-12/performance/1SSD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/file-ioctl/stress-ng/60s

1f4f7f0f8845dbac 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    340971            -6.9%     317411        stress-ng.file-ioctl.ops
      5682            -6.9%       5290        stress-ng.file-ioctl.ops_per_sec
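
For anyone cross-checking the numbers: the %change column in the two tables above appears to be the plain relative difference between the two reported values. Below is a minimal sketch under that assumption. The helper name pct_change is made up for illustration and is not part of lkp-tests; the input values are copied from the tables.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent.

    Assumption: this mirrors how the %change column is derived;
    the helper itself is hypothetical, not an lkp-tests function.
    """
    return (new - base) / base * 100.0


# (base, new) pairs: 1f4f7f0f8845d vs 6ccdcb6d3a741, from the tables above
results = {
    "stress-ng.judy.Judy_insert_operations_per_sec": (9036524, 8685310),
    "stress-ng.judy.ops_per_sec": (2853, 2796),
    "stress-ng.file-ioctl.ops_per_sec": (5682, 5290),
}

for metric, (base, new) in results.items():
    # Print in the same sign/precision style as the %change column
    print(f"{pct_change(base, new):+6.1f}%  {metric}")
```

With these inputs the sketch gives -3.9%, -2.0% and -6.9%, matching the tables.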
>
> --
> Best Regards,
> Huang, Ying
>
View attachment "ncompare-judy" of type "text/plain" (182481 bytes)
View attachment "ncompare-file-ioctl" of type "text/plain" (207252 bytes)