[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ye+dPmO17JN2bNLL@google.com>
Date: Mon, 24 Jan 2022 23:48:30 -0700
From: Yu Zhao <yuzhao@...gle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andi Kleen <ak@...ux.intel.com>,
Catalin Marinas <catalin.marinas@....com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
Jesse Barnes <jsbarnes@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Michael Larabel <Michael@...haellarabel.com>,
Michal Hocko <mhocko@...nel.org>,
Rik van Riel <riel@...riel.com>,
Vlastimil Babka <vbabka@...e.cz>,
Will Deacon <will@...nel.org>,
Ying Huang <ying.huang@...el.com>,
LAK <linux-arm-kernel@...ts.infradead.org>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, page-reclaim@...gle.com,
x86 <x86@...nel.org>
Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework
On Sun, Jan 23, 2022 at 06:43:06PM +1300, Barry Song wrote:
> On Wed, Jan 5, 2022 at 7:17 PM Yu Zhao <yuzhao@...gle.com> wrote:
<snipped>
> > Large-scale deployments
> > -----------------------
> > We've rolled out MGLRU to tens of millions of Chrome OS users and
> > about a million Android users. Google's fleetwide profiling [13] shows
> > an overall 40% decrease in kswapd CPU usage, in addition to
>
> Hi Yu,
>
> Was the overall 40% decrease of kswap CPU usgae seen on x86 or arm64?
> And I am curious how much we are taking advantage of NONLEAF_PMD_YOUNG.
> Does it help a lot in decreasing the cpu usage?
Hi Barry,
The fleet-wide profiling data I shared was from x86. For arm64, I only
have data from synthetic benchmarks at the moment, and it also shows
similar improvements.
For Chrome OS (individual users), walk_pte_range(), the function that
would benefit from ARCH_HAS_NONLEAF_PMD_YOUNG, only uses a small
portion (<4%) of kswapd CPU time. So ARCH_HAS_NONLEAF_PMD_YOUNG isn't
that helpful.
> If so, this might be
> a good proof that arm64 also needs this hardware feature?
> In short, I am curious how much the improvement in this patchset depends
> on the hardware ability of NONLEAF_PMD_YOUNG.
For data centers, I do think ARCH_HAS_NONLEAF_PMD_YOUNG has some value.
In addition to cold/hot memory scanning, there are other use cases like
dirty tracking, which can benefit from the accessed bit on non-leaf
entries. I know some proprietary software uses this capability on x86
for different purposes than this patchset does. And AFAIK, x86 is the
only arch that supports this capability, e.g., risc-v and ppc can only
set the accessed bit in PTEs.
In fact, I've discussed this with one of the arm maintainers Will. So
please check with him too if you are interested in moving forward with
the idea. I might be able to provide with additional data if you need
it to make a decision.
Thanks.
Powered by blists - more mailing lists