Message-ID: <87y0z2jiom.fsf@DESKTOP-5N7EMDA>
Date: Thu, 23 Jan 2025 11:46:49 +0800
From: "Huang, Ying" <ying.huang@...ux.alibaba.com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com,
nehagholkar@...a.com, abhishekd@...a.com, david@...hat.com,
nphamcs@...il.com, akpm@...ux-foundation.org, hannes@...xchg.org,
kbusch@...a.com, feng.tang@...el.com, donettom@...ux.ibm.com
Subject: Re: [RFC v3 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

Gregory Price <gourry@...rry.net> writes:

> On Wed, Jan 22, 2025 at 07:16:03PM +0800, Huang, Ying wrote:
>> Hi, Gregory,
>> > Test process:
>> > In each test, we do a linear read of a 128GB file into a buffer
>> > in a loop.
>>
>> IMHO, linear reading isn't a very good test case for promotion, because
>> it cannot exercise the hot-page selection algorithm. I think it's
>> better to use something like a normally distributed access pattern.
>> IIRC, that is available in the fio test suite.
>>
>
> Oh yes, I don't plan to drop the RFC tag until I can get a real workload
> and probably fio running under this. This patch set is of varying
> priority for me at the moment, so new versions will take some time. My
> goal is to have something a bit more solid by LSF/MM, but not before.

No problem.

>> > 1) file allocated in DRAM with mechanisms off
>> > 2) file allocated in DRAM with balancing on but promotion off
>> > 3) file allocated in DRAM with balancing and promotion on
>> > (promotion check is negative because all pages are top tier)
>> > 4) file allocated in CXL with mechanisms off
>> > 5) file allocated in CXL with mechanisms on
>> >
>> > |     1     |    2     |      3      |    4     |       5        |
>> > | DRAM Base | Promo On | TopTier Chk | CXL Base | Post-Promotion |
>> > |  7.5804   |  7.7586  |   7.9726    |   9.75   |     7.8941     |
>>
>> For 3, we can check whether the folio is in top-tier as the first step.
>> Will that introduce measurable overhead?
>>
>
> That is basically what 2 vs 3 is doing.
>
> Test 2 shows overhead of TPP on + pagecache promo off
> Test 3 shows overhead of TPP+Promo on, but all the memory is on top tier
>
> This shows the check as to whether the folio is in the top tier is
> actually somewhat expensive (~5% compared to baseline, ~2.7% compared to
> TPP-on Promo-off).
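
For reference, the percentages can be recomputed from the table above.
A small user-space sketch follows; overhead_pct() is a helper name made
up for illustration, and it assumes all table values are in the same
unit:

```c
#include <assert.h>

/*
 * Relative overhead of `measured` over `base`, in percent.
 * Hypothetical helper, not part of the patch set.
 */
static double overhead_pct(double base, double measured)
{
	assert(base > 0.0);	/* a ratio against zero is meaningless */
	return (measured / base - 1.0) * 100.0;
}
```

With the numbers from the table, overhead_pct(7.5804, 7.9726) comes out
at roughly 5.2 and overhead_pct(7.7586, 7.9726) at roughly 2.8, in line
with the figures quoted above.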

This is unexpected. Can we try to optimize it, for example by using a
nodemask? node_is_toptier() is also used in mapped-page promotion
(1 vs. 2 above), so I guess the optimization would measurably reduce
the overhead there too.

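
To illustrate the nodemask idea, here is a minimal user-space sketch,
not kernel code. All sketch_* names and the 64-node limit are
assumptions for illustration only. The point is that the fast path
becomes a single bit test against a mask that is updated only when the
tier layout changes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64	/* hypothetical limit for this sketch */

/* One bit per NUMA node; a set bit marks the node as top tier. */
static uint64_t sketch_toptier_mask;

/* Slow path: called only when the tier layout changes. */
static void sketch_set_toptier(int node, bool toptier)
{
	assert(node >= 0 && node < SKETCH_MAX_NODES);
	if (toptier)
		sketch_toptier_mask |= UINT64_C(1) << node;
	else
		sketch_toptier_mask &= ~(UINT64_C(1) << node);
}

/* Fast path: a single bit test, with no per-call tier lookup. */
static bool sketch_node_is_toptier(int node)
{
	if (node < 0 || node >= SKETCH_MAX_NODES)
		return false;
	return (sketch_toptier_mask >> node) & 1;
}
```

In the kernel this would presumably be a nodemask_t updated wherever
the memory tiers are (re)built, with the check reduced to a node_isset()
style test. Whether that is cheaper than the current implementation
would need to be measured.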
> The goal of this linear, simple test is to isolate test behavior from
> the overhead - that makes it easy to test each individual variable (TPP,
> promo, top tier, etc) and see relative overheads.
>
> This basically gives us a reasonable floor/ceiling of expected overhead.
> If we see something wildly different than this during something like FIO
> or a real workload, then we'll know we missed something.
>
>> >
>> > This could be further limited by limiting the promotion rate via the
>> > existing knob, or by implementing a new knob detached from the existing
>> > promotion rate. There are merits to both approaches.
>>
>> Have you tested with the existing knob? Does it help?
>>
>
> Not yet, this fell off my priority list before I could do additional
> testing. I will add that to my backlog.

No problem.

---
Best Regards,
Huang, Ying