linux-kernel - Re: [RFC v2 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

Message-ID: <87v7v5g99x.fsf@DESKTOP-5N7EMDA>
Date: Fri, 27 Dec 2024 10:16:42 +0800
From: "Huang, Ying" <ying.huang@...ux.alibaba.com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-mm@...ck.org,  linux-kernel@...r.kernel.org,
  nehagholkar@...a.com,  abhishekd@...a.com,  kernel-team@...a.com,
  david@...hat.com,  nphamcs@...il.com,  akpm@...ux-foundation.org,
  hannes@...xchg.org,  kbusch@...a.com
Subject: Re: [RFC v2 PATCH 0/5] Promotion of Unmapped Page Cache Folios.

Gregory Price <gourry@...rry.net> writes:

> On Sun, Dec 22, 2024 at 03:09:44PM +0800, Huang, Ying wrote:
>> Gregory Price <gourry@...rry.net> writes:
>> > That's 3-6% performance in this contrived case.
>> 
>> This is small too.
>>
>
> Small is relative.  3-6% performance increase across millions of servers
> across a year is a non trivial speedup for such a common operation.

If we can only get a 3-6% performance increase in a micro-benchmark,
how much can we get from real-life workloads?

Anyway, we need to prove the usefulness of the change via data.  3-6%
isn't strong data.

Can we measure the largest possible improvement?  For example, run the
benchmark once with all file pages in DRAM and once with all file pages
in CXL.mem (via NUMA binding), and compare.
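
A rough sketch of that comparison, not taken from the patch set: bind the
benchmark's memory to one node per run with numactl --membind, time each
run, and compute the gap.  The node numbers, the ./bench binary, its
argument, and the timings below are hypothetical placeholders.  It also
assumes the page cache is dropped between runs, so that the file pages
are actually allocated under the binding.

```python
def numactl_command(node, benchmark_argv):
    """Build an argv that runs the benchmark with memory bound to one NUMA node.

    `numactl --membind=<node>` restricts the task's allocations to that node.
    The node number and the benchmark command are supplied by the caller.
    """
    return ["numactl", f"--membind={node}"] + list(benchmark_argv)


def improvement_percent(slow_seconds, fast_seconds):
    """Return how much faster the fast run is, as a percentage of the slow run."""
    if slow_seconds <= 0:
        raise ValueError("slow_seconds must be positive")
    return (slow_seconds - fast_seconds) / slow_seconds * 100.0


if __name__ == "__main__":
    # Hypothetical topology: node 0 is DRAM, node 1 is CXL.mem.
    DRAM_NODE, CXL_NODE = 0, 1
    # Hypothetical benchmark binary and input file.
    bench = ["./bench", "testfile"]

    print(" ".join(numactl_command(DRAM_NODE, bench)))
    print(" ".join(numactl_command(CXL_NODE, bench)))

    # Placeholder timings; substitute the measured wall-clock times.
    cxl_seconds, dram_seconds = 10.0, 9.5
    print(f"{improvement_percent(cxl_seconds, dram_seconds):.1f}% upper bound")
```

The all-DRAM versus all-CXL.mem gap is the ceiling for what promotion can
gain on that benchmark, so it gives the 3-6% figure something to be
compared against.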

>> > Can easily piggyback on that, just wasn't sure if overloading it was
>> > an acceptable idea.
>> 
>> It's the recommended setup in the original PMEM promotion
>> implementation.  Please check commit c959924b0dc5 ("memory tiering:
>> adjust hot threshold automatically").
>> 
>> > Although since that promotion rate limit is also
>> > per-task (as far as I know, will need to read into it a bit more) this
>> > is probably fine.
>> 
>> It's not per-task.  Please read the code, especially
>> should_numa_migrate_memory().
>
> Oh, then this is already throttled.  We call mpol_misplaced which calls
> should_numa_migrate_memory. 
>
> There's some duplication of candidate selection logic between
> promotion_candidate and should_numa_migrate_memory, but it may be
> beneficial to keep it that way.  I'll have to look.

---
Best Regards,
Huang, Ying