linux-kernel - Re: [RFC v3 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Z5EhcQERseKShtGY@gourry-fedora-PF4VCD3F>
Date: Wed, 22 Jan 2025 11:48:49 -0500
From: Gregory Price <gourry@...rry.net>
To: "Huang, Ying" <ying.huang@...ux.alibaba.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel-team@...a.com,
	nehagholkar@...a.com, abhishekd@...a.com, david@...hat.com,
	nphamcs@...il.com, akpm@...ux-foundation.org, hannes@...xchg.org,
	kbusch@...a.com, feng.tang@...el.com, donettom@...ux.ibm.com
Subject: Re: [RFC v3 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

On Wed, Jan 22, 2025 at 07:16:03PM +0800, Huang, Ying wrote:
> Hi, Gregory,
> > Test process:
> >    In each test, we do a linear read of a 128GB file into a buffer
> >    in a loop.
> 
> IMHO, the linear reading isn't a very good test case for promotion.  You
> cannot test the hot-page selection algorithm.  I think that it's better
> to use something like normal accessing pattern.  IIRC, it is available
> in fio test suite.
>

Oh yes, I don't plan to drop RFC until I can get a real workload and
probably fio running under this.  This patch set is varying priority for
me at the moment so the versions will take some time.  My goal is to
have something a bit more solid by LSF/MM, but not before.

> >    1) file allocated in DRAM with mechanisms off
> >    2) file allocated in DRAM with balancing on but promotion off
> >    3) file allocated in DRAM with balancing and promotion on
> >       (promotion check is negative because all pages are top tier)
> >    4) file allocated in CXL with mechanisms off
> >    5) file allocated in CXL with mechanisms on
> >
> > |     1     |    2     |     3       |    4     |      5         |
> > | DRAM Base | Promo On | TopTier Chk | CXL Base | Post-Promotion |
> > |  7.5804   |  7.7586  |   7.9726    |   9.75   |    7.8941      |
> 
> For 3, we can check whether the folio is in top-tier as the first step.
> Will that introduce measurable overhead?
>

That is basically what 2 vs 3 is doing.

Test 2 shows overhead of TPP on + pagecache promo off
Test 3 shows overhead of TPP+Promo on, but all the memory is on top tier

This shows the check as to whether the folio is in the top tier is
actually somewhat expensive (~5% compared to baseline, ~2.7% compared to
TPP-on Promo-off).

The goal of this linear, simple test is to isolate test behavior from
the overhead - that makes it easy to test each individual variable (TPP,
promo, top tier, etc) and see relative overheads.

This basically gives us a reasonable floor/ceiling of expected overhead.
If we see something wildly different than this during something like FIO
or a real workload, then we'll know we missed something.

> >
> > This could be further limited by limiting the promotion rate via the
> > existing knob, or by implementing a new knob detached from the existing
> > promotion rate.  There are merits to both approach.
> 
> Have you tested with the existing knob?  Whether does it help?
>

Not yet, this fell off my priority list before I could do additional
testing.  I will add that to my backlog.

~Gregory