Message-ID: <CAJD7tka0o+jn3UkXB+ZfZvRw1v+KysJbaGQvJdHcSmAhYC5TQA@mail.gmail.com>
Date: Mon, 22 Jan 2024 13:39:19 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, Johannes Weiner <hannes@...xchg.org>,
Domenico Cerasuolo <cerasuolodomenico@...il.com>, Shakeel Butt <shakeelb@...gle.com>,
Chris Li <chrisl@...nel.org>, Greg Thelen <gthelen@...gle.com>,
Ivan Babrou <ivan@...udflare.com>, Michal Hocko <mhocko@...nel.org>,
Michal Koutny <mkoutny@...e.com>, Muchun Song <muchun.song@...ux.dev>,
Roman Gushchin <roman.gushchin@...ux.dev>, Tejun Heo <tj@...nel.org>,
Waiman Long <longman@...hat.com>, Wei Xu <weixugc@...gle.com>, cgroups@...r.kernel.org,
linux-mm@...ck.org, ying.huang@...el.com, feng.tang@...el.com,
fengwei.yin@...el.com
Subject: Re: [linus:master] [mm] 8d59d2214c: vm-scalability.throughput -36.6% regression
On Mon, Jan 22, 2024 at 12:39 AM kernel test robot
<oliver.sang@...el.com> wrote:
>
>
>
> hi, Yosry Ahmed,
>
> per your suggestion in
> https://lore.kernel.org/all/CAJD7tkameJBrJQxRj+ibKL6-yd-i0wyoyv2cgZdh3ZepA1p7wA@mail.gmail.com/
> "I think it would be useful to know if there are
> regressions/improvements in other microbenchmarks, at least to
> investigate whether they represent real regressions."
>
> we still report below two regressions to you just FYI what we observed in our
> microbenchmark tests.
> (we still captured will-it-scale::fallocate regression but ignore here per
> your commit message)
>
>
> Hello,
>
> kernel test robot noticed a -36.6% regression of vm-scalability.throughput on:
>
>
> commit: 8d59d2214c2362e7a9d185d80b613e632581af7b ("mm: memcg: make stats flushing threshold per-memcg")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> testcase: vm-scalability
> test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
> parameters:
>
> runtime: 300s
> size: 1T
> test: lru-shm
> cpufreq_governor: performance
>
> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops -32.3% regression |
> | test machine | 104 threads 2 sockets (Skylake) with 192G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=process |
> | | nr_task=50% |
> | | test=tlb_flush2 |
> +------------------+----------------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202401221624.cb53a8ca-oliver.sang@intel.com

Thanks for reporting this. We have had these patches running on O(10K)
machines in our production for a while now, and there haven't been any
complaints (at least not yet). OTOH, we do see significant CPU savings
on reading memcg stats.

That being said, I think we can improve the performance here by
caching pointers to the parent_memcg->vmstats_percpu and
memcg->vmstats in struct memcg_vmstat_percpu. This should
significantly reduce the memory fetches in the loop in
memcg_rstat_updated().

Oliver, would you be able to test if the attached patch helps? It's
based on 8d59d2214c236.
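
For readers without the attachment, here is a rough userspace sketch of the
caching idea described above. It is not the attached patch and not kernel
code: the struct layout, NR_CPUS, FLUSH_THRESHOLD, memcg_init() and
rstat_updated() are all hypothetical stand-ins, and the plain long counter
replaces what would be an atomic in the kernel. The point it tries to show is
that each per-CPU entry carries a pointer to the parent's entry for the same
CPU and to its own group's shared counters, so the upward walk follows one
cached pointer per level instead of going through memcg->parent and then its
per-CPU area.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4          /* hypothetical: tiny CPU count for illustration */
#define FLUSH_THRESHOLD 64 /* hypothetical per-CPU batching threshold */

/* Shared (cross-CPU) counters of one group; atomic in a real kernel. */
struct vmstats {
    long stats_updates;
};

/* Per-CPU counters of one group, plus the two cached pointers. */
struct vmstats_percpu {
    unsigned int stats_updates;    /* updates batched on this CPU */
    struct vmstats_percpu *parent; /* cached: parent's entry for the same CPU */
    struct vmstats *vmstats;       /* cached: owning group's shared counters */
};

struct memcg {
    struct memcg *parent;
    struct vmstats vmstats;
    struct vmstats_percpu percpu[NR_CPUS];
};

/* Fill in the cached pointers once, when the group is created. */
static void memcg_init(struct memcg *memcg, struct memcg *parent)
{
    memcg->parent = parent;
    memcg->vmstats.stats_updates = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        struct vmstats_percpu *pc = &memcg->percpu[cpu];

        pc->stats_updates = 0;
        pc->parent = parent ? &parent->percpu[cpu] : NULL;
        pc->vmstats = &memcg->vmstats;
    }
}

/*
 * Record `val` updates on `cpu` for `memcg` and all of its ancestors.
 * The walk only touches per-CPU entries until a batch fills up; only
 * then does it follow the cached pointer to the shared counters.
 */
static void rstat_updated(struct memcg *memcg, int cpu, unsigned int val)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    for (struct vmstats_percpu *pc = &memcg->percpu[cpu]; pc; pc = pc->parent) {
        pc->stats_updates += val;
        if (pc->stats_updates < FLUSH_THRESHOLD)
            continue;
        pc->vmstats->stats_updates += pc->stats_updates;
        pc->stats_updates = 0;
    }
}
```

In this model a child and its root each keep a per-CPU batch; the shared
counter of a level is written only when that level's batch reaches the
threshold, and the loop never dereferences the struct memcg itself.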
[..]
Download attachment "0001-mm-memcg-optimize-parent-iteration-in-memcg_rstat_up.patch" of type "application/octet-stream" (4006 bytes)