linux-kernel - Re: [RFC v3 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

Message-ID: <87y0z2jiom.fsf@DESKTOP-5N7EMDA>
Date: Thu, 23 Jan 2025 11:46:49 +0800
From: "Huang, Ying" <ying.huang@...ux.alibaba.com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-mm@...ck.org,  linux-doc@...r.kernel.org,
  linux-kernel@...r.kernel.org,  kernel-team@...a.com,
  nehagholkar@...a.com,  abhishekd@...a.com,  david@...hat.com,
  nphamcs@...il.com,  akpm@...ux-foundation.org,  hannes@...xchg.org,
  kbusch@...a.com,  feng.tang@...el.com,  donettom@...ux.ibm.com
Subject: Re: [RFC v3 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

Gregory Price <gourry@...rry.net> writes:

> On Wed, Jan 22, 2025 at 07:16:03PM +0800, Huang, Ying wrote:
>> Hi, Gregory,
>> > Test process:
>> >    In each test, we do a linear read of a 128GB file into a buffer
>> >    in a loop.
>> 
>> IMHO, linear reading isn't a very good test case for promotion, because
>> it cannot exercise the hot-page selection algorithm.  I think it's better
>> to use something like a normally distributed access pattern.  IIRC, one
>> is available in the fio test suite.
>>
>
> Oh yes, I don't plan to drop RFC until I can get a real workload and
> probably fio running under this.  This patch set varies in priority for
> me at the moment, so new versions will take some time.  My goal is to
> have something a bit more solid by LSF/MM, but not before.

No problem.
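
For illustration only: a skewed access pattern of the kind suggested above
could look like the userspace sketch below.  It reads "normal accessing
pattern" as a normal (Gaussian) distribution, which is an assumption, and
approximates that distribution by summing 12 uniform samples (Irwin-Hall)
instead of using an exact method.  fio's random_distribution option is
probably the setting meant, though the thread does not name it.  The names
prng_next() and pick_page_normal() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: splitmix64, a small deterministic PRNG so that
 * runs are repeatable. */
static uint64_t prng_next(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);

    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/*
 * Pick a page index in [0, nr_pages), approximately normally distributed
 * around the middle of the file.  Summing 12 uniform samples gives a
 * bell-shaped value whose standard deviation is about 1/12 of the range,
 * so pages near the middle are hot and pages near both ends are cold.
 * Assumes nr_pages is small enough that sum * nr_pages fits in 64 bits
 * (true for any realistic file size).
 */
static size_t pick_page_normal(uint64_t *state, size_t nr_pages)
{
    uint64_t sum = 0;
    int i;

    assert(nr_pages > 0);
    for (i = 0; i < 12; i++)
        sum += prng_next(state) >> 48;  /* 16-bit uniform sample, 0..65535 */

    /* sum lies in [0, 12 * 65535]; scale it down to [0, nr_pages). */
    return (size_t)((sum * (uint64_t)nr_pages) / (12ULL * 65535 + 1));
}
```

A test loop would call pick_page_normal() once per read and pread() the
chosen page, instead of walking the file front to back.  Unlike a linear
scan, this gives the hot-page selection logic a stable hot set to find.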

>> >    1) file allocated in DRAM with mechanisms off
>> >    2) file allocated in DRAM with balancing on but promotion off
>> >    3) file allocated in DRAM with balancing and promotion on
>> >       (promotion check is negative because all pages are top tier)
>> >    4) file allocated in CXL with mechanisms off
>> >    5) file allocated in CXL with mechanisms on
>> >
>> > |     1     |    2     |     3       |    4     |      5         |
>> > | DRAM Base | Promo On | TopTier Chk | CXL Base | Post-Promotion |
>> > |  7.5804   |  7.7586  |   7.9726    |   9.75   |    7.8941      |
>> 
>> For 3, we can check whether the folio is in top-tier as the first step.
>> Will that introduce measurable overhead?
>>
>
> That is basically what 2 vs 3 is doing.
>
> Test 2 shows overhead of TPP on + pagecache promo off
> Test 3 shows overhead of TPP+Promo on, but all the memory is on top tier
>
> This shows the check as to whether the folio is in the top tier is
> actually somewhat expensive (~5% compared to baseline, ~2.7% compared to
> TPP-on Promo-off).
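
The percentages quoted above can be reproduced from the table with the
small sketch below.  The numbers are copied from the table as posted (its
units are not stated in the thread); the helper name overhead_pct() is
hypothetical.

```c
#include <assert.h>

/* Results copied from the table in this thread, columns 1 to 5. */
static const double DRAM_BASE      = 7.5804;
static const double PROMO_ON       = 7.7586;
static const double TOPTIER_CHK    = 7.9726;
static const double CXL_BASE       = 9.75;
static const double POST_PROMOTION = 7.8941;

/* Relative change of `value` against `base`, in percent. */
static double overhead_pct(double base, double value)
{
    assert(base > 0.0);
    return (value - base) / base * 100.0;
}

/*
 * overhead_pct(DRAM_BASE, PROMO_ON)       is about  +2.4  (test 2 vs 1)
 * overhead_pct(DRAM_BASE, TOPTIER_CHK)    is about  +5.2  (test 3 vs 1)
 * overhead_pct(PROMO_ON, TOPTIER_CHK)     is about  +2.8  (test 3 vs 2)
 * overhead_pct(CXL_BASE, POST_PROMOTION)  is about -19.0  (test 5 vs 4)
 * overhead_pct(DRAM_BASE, POST_PROMOTION) is about  +4.1  (test 5 vs 1)
 */
```

So the "~5%" and "~2.7%" figures match the table, and the promoted CXL
case ends up within roughly 4% of the DRAM baseline.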

This is unexpected.  Can we try to optimize it, for example by using a
nodemask?  node_is_toptier() is also used in mapped-page promotion
(1 vs. 2 above), so I guess the optimization could measurably reduce
the overhead there too.
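
A rough userspace model of the nodemask idea, not kernel code: keep one
bit per node that says whether the node is top tier, update it only when
the tier layout changes, and let the hot path do a single bit test.  All
names below (struct toptier_mask, toptier_mask_set(),
node_is_toptier_fast()) are hypothetical; in the kernel the natural
building blocks would be nodemask_t and node_isset(), and updates would
need whatever synchronization the memory-tier code already uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64  /* hypothetical limit for this userspace model */

/* Hypothetical stand-in for a kernel nodemask_t: one bit per NUMA node. */
struct toptier_mask {
    uint64_t bits;
};

/*
 * Slow path: called only when the tier layout changes (for example on
 * memory hotplug), never per folio.
 */
static void toptier_mask_set(struct toptier_mask *m, int node, bool toptier)
{
    assert(node >= 0 && node < SKETCH_MAX_NODES);
    if (toptier)
        m->bits |= UINT64_C(1) << node;
    else
        m->bits &= ~(UINT64_C(1) << node);
}

/* Fast path: one shift and one mask, no locking and no pointer chasing. */
static bool node_is_toptier_fast(const struct toptier_mask *m, int node)
{
    assert(node >= 0 && node < SKETCH_MAX_NODES);
    return (m->bits >> node) & 1;
}
```

Whether this actually removes the overhead measured in test 3 would have
to be confirmed by rerunning tests 2 and 3 with such a change applied.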

> The goal of this linear, simple test is to isolate test behavior from
> the overhead - that makes it easy to test each individual variable (TPP,
> promo, top tier, etc) and see relative overheads.
>
> This basically gives us a reasonable floor/ceiling of expected overhead.
> If we see something wildly different than this during something like FIO
> or a real workload, then we'll know we missed something.
>
>> >
>> > This could be further limited by limiting the promotion rate via the
>> > existing knob, or by implementing a new knob detached from the existing
>> > promotion rate.  There are merits to both approaches.
>> 
>> Have you tested with the existing knob?  Does it help?
>>
>
> Not yet, this fell off my priority list before I could do additional
> testing.  I will add that to my backlog.

No problem.

---
Best Regards,
Huang, Ying