linux-kernel - Re: [linus:master] [mm] 8d59d2214c: vm-scalability.throughput -36.6% regression

Message-ID: <Za9pB928KjSORPw+@xsang-OptiPlex-9020>
Date: Tue, 23 Jan 2024 15:21:43 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, Johannes Weiner
	<hannes@...xchg.org>, Domenico Cerasuolo <cerasuolodomenico@...il.com>,
	Shakeel Butt <shakeelb@...gle.com>, Chris Li <chrisl@...nel.org>, Greg Thelen
	<gthelen@...gle.com>, Ivan Babrou <ivan@...udflare.com>, Michal Hocko
	<mhocko@...nel.org>, Michal Koutny <mkoutny@...e.com>, Muchun Song
	<muchun.song@...ux.dev>, Roman Gushchin <roman.gushchin@...ux.dev>, Tejun Heo
	<tj@...nel.org>, Waiman Long <longman@...hat.com>, Wei Xu
	<weixugc@...gle.com>, <cgroups@...r.kernel.org>, <linux-mm@...ck.org>,
	<ying.huang@...el.com>, <feng.tang@...el.com>, <fengwei.yin@...el.com>,
	<oliver.sang@...el.com>
Subject: Re: [linus:master] [mm] 8d59d2214c: vm-scalability.throughput -36.6%
 regression

hi, Yosry Ahmed,

On Mon, Jan 22, 2024 at 01:39:19PM -0800, Yosry Ahmed wrote:
> On Mon, Jan 22, 2024 at 12:39 AM kernel test robot
> <oliver.sang@...el.com> wrote:
> >
> >
> >
> > hi, Yosry Ahmed,
> >
> > per your suggestion in
> > https://lore.kernel.org/all/CAJD7tkameJBrJQxRj+ibKL6-yd-i0wyoyv2cgZdh3ZepA1p7wA@mail.gmail.com/
> > "I think it would be useful to know if there are
> > regressions/improvements in other microbenchmarks, at least to
> > investigate whether they represent real regressions."
> >
> > we still report below two regressions to you just FYI what we observed in our
> > microbenchmark tests.
> > (we still captured will-it-scale::fallocate regression but ignore here per
> > your commit message)
> >
> >
> > Hello,
> >
> > kernel test robot noticed a -36.6% regression of vm-scalability.throughput on:
> >
> >
> > commit: 8d59d2214c2362e7a9d185d80b613e632581af7b ("mm: memcg: make stats flushing threshold per-memcg")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > testcase: vm-scalability
> > test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
> > parameters:
> >
> >         runtime: 300s
> >         size: 1T
> >         test: lru-shm
> >         cpufreq_governor: performance
> >
> > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+----------------------------------------------------------------------------------------------------+
> > | testcase: change | will-it-scale: will-it-scale.per_process_ops -32.3% regression                                     |
> > | test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                   |
> > | test parameters  | cpufreq_governor=performance                                                                       |
> > |                  | mode=process                                                                                       |
> > |                  | nr_task=50%                                                                                        |
> > |                  | test=tlb_flush2                                                                                    |
> > +------------------+----------------------------------------------------------------------------------------------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202401221624.cb53a8ca-oliver.sang@intel.com
> 
> Thanks for reporting this. We have had these patches running on O(10K)
> machines in our production for a while now, and there haven't been any
> complaints (at least not yet). OTOH, we do see significant CPU savings
> on reading memcg stats.
> 
> That being said, I think we can improve the performance here by
> caching pointers to the parent_memcg->vmstats_percpu and
> memcg->vmstats in struct memcg_vmstat_percpu. This should
> significantly reduce the memory fetches in the loop in
> memcg_rstat_updated().
> 
> Oliver, would you be able to test if the attached patch helps? It's
> based on 8d59d2214c236.

the patch failed to compile:

build_errors:
  - "mm/memcontrol.c:731:38: error: 'x' undeclared (first use in this function)"


> 
> [..]
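
For reference, the pointer-caching idea quoted above can be sketched in
user space roughly as follows. This is only an illustrative model under
stated assumptions, not the attached patch (which is not shown in this
message): it is single-threaded, models one CPU, and uses plain integers
where the kernel uses per-CPU and atomic operations. The names
vmstats_sketch, vmstats_percpu_sketch, BATCH_SKETCH, NR_CPUS_SKETCH and
rstat_updated_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants for this model only. */
#define BATCH_SKETCH   64   /* per-CPU updates batched before propagating */
#define NR_CPUS_SKETCH 1    /* the model has a single CPU */

/* Shared (per-memcg) state: pending updates folded in from per-CPU state. */
struct vmstats_sketch {
    long stats_updates;
};

/* Per-CPU state with the two cached pointers suggested in the thread. */
struct vmstats_percpu_sketch {
    unsigned int stats_updates;            /* local pending updates */
    struct vmstats_percpu_sketch *parent;  /* parent's per-CPU state, NULL at root */
    struct vmstats_sketch *vmstats;        /* owning memcg's shared state */
};

/* A memcg is considered due for a flush once enough updates are pending. */
static int needs_flush_sketch(const struct vmstats_sketch *v)
{
    return v->stats_updates > (long)BATCH_SKETCH * NR_CPUS_SKETCH;
}

/*
 * Walk from a memcg's per-CPU state up to the root using only the cached
 * pointers, so each iteration touches the per-CPU struct (and the shared
 * struct only when a batch is full) instead of dereferencing the memcg.
 */
static void rstat_updated_sketch(struct vmstats_percpu_sketch *pcpu,
                                 unsigned int val)
{
    assert(pcpu != NULL);

    for (; pcpu; pcpu = pcpu->parent) {
        pcpu->stats_updates += val;
        if (pcpu->stats_updates < BATCH_SKETCH)
            continue;

        /* Batch is full: fold it into the shared counter unless the
         * memcg is already past its flush threshold. */
        if (!needs_flush_sketch(pcpu->vmstats))
            pcpu->vmstats->stats_updates += pcpu->stats_updates;
        pcpu->stats_updates = 0;
    }
}
```

In this model a caller would set up one vmstats_percpu_sketch per memcg,
point parent at the parent's per-CPU struct and vmstats at the memcg's
shared struct, then call rstat_updated_sketch() on the leaf. The loop body
never needs the memcg itself, which is the reduction in memory fetches the
quoted paragraph describes.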