linux-kernel - Re: [RFC PATCH 2/4] mm: kpromoted: Hot page info collection and promotion daemon

Message-ID: <038d0332-2146-4bda-adf6-03ef58dcc3b5@amd.com>
Date: Mon, 17 Mar 2025 21:52:29 +0530
From: Bharata B Rao <bharata@....com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 AneeshKumar.KizhakeVeetil@....com, Hasan.Maruf@....com,
 Jonathan.Cameron@...wei.com, Michael.Day@....com, akpm@...ux-foundation.org,
 dave.hansen@...el.com, david@...hat.com, feng.tang@...el.com,
 hannes@...xchg.org, honggyu.kim@...com, hughd@...gle.com,
 jhubbard@...dia.com, k.shutemov@...il.com, kbusch@...a.com,
 kmanaouil.dev@...il.com, leesuyeon0506@...il.com, leillc@...gle.com,
 liam.howlett@...cle.com, mgorman@...hsingularity.net, mingo@...hat.com,
 nadav.amit@...il.com, nphamcs@...il.com, peterz@...radead.org,
 raghavendra.kt@....com, riel@...riel.com, rientjes@...gle.com,
 rppt@...nel.org, shivankg@....com, shy828301@...il.com, sj@...nel.org,
 vbabka@...e.cz, weixugc@...gle.com, willy@...radead.org,
 ying.huang@...ux.alibaba.com, ziy@...dia.com, yuanchu@...gle.com
Subject: Re: [RFC PATCH 2/4] mm: kpromoted: Hot page info collection and
 promotion daemon

On 17-Mar-25 8:35 PM, Gregory Price wrote:
> On Mon, Mar 17, 2025 at 09:09:18AM +0530, Bharata B Rao wrote:
>> On 13-Mar-25 10:14 PM, Davidlohr Bueso wrote:
>>> On Thu, 06 Mar 2025, Bharata B Rao wrote:
>>>
>>>> +static int page_should_be_promoted(struct page_hotness_info *phi)
>>>> +{
>>>> +    struct page *page = pfn_to_online_page(phi->pfn);
>>>> +    unsigned long now = jiffies;
>>>> +    struct folio *folio;
>>>> +
>>>> +    if (!page || is_zone_device_page(page))
>>>> +        return false;
>>>> +
>>>> +    folio = page_folio(page);
>>>> +    if (!folio_test_lru(folio)) {
>>>> +        count_vm_event(KPROMOTED_MIG_NON_LRU);
>>>> +        return false;
>>>> +    }
>>>> +    if (folio_nid(folio) == phi->hot_node) {
>>>> +        count_vm_event(KPROMOTED_MIG_RIGHT_NODE);
>>>> +        return false;
>>>> +    }
>>>
>>> How about using the LRU age itself:
>>
>> Sounds like a good check for page hotness.
>>
>>>
>>> if (folio_test_active())
>>>      return true;
>>
>> But the numbers I obtained with this check added, didn't really hit this
>> condition all that much. I was running a multi-threaded application that
>> allocates enough memory such that the allocation spills over from DRAM node
>> to the CXL node. Threads keep touching the memory pages in random order.
>>
> 
> Is demotion enabled by any chance?

Yes, I assumed that demotion had to be enabled to create enough room in
the toptier to handle promotion.
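
For anyone reproducing this, the demotion state can be checked before a run. The following is only a sketch: it assumes the standard /sys/kernel/mm/numa/demotion_enabled knob, and the helper names parse_demotion_flag() and demotion_enabled() are made up for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* sysfs knob that controls reclaim-based demotion */
#define DEMOTION_KNOB "/sys/kernel/mm/numa/demotion_enabled"

/* Hypothetical helper: interpret the knob text ("true"/"false" or "1"/"0"). */
bool parse_demotion_flag(const char *text)
{
    assert(text != NULL);
    return strncmp(text, "true", 4) == 0 || text[0] == '1';
}

/* Hypothetical helper: 1 if enabled, 0 if disabled, -1 if the knob is unreadable. */
int demotion_enabled(void)
{
    char buf[16] = "";
    FILE *f = fopen(DEMOTION_KNOB, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_demotion_flag(buf) ? 1 : 0;
}
```

A return value of -1 most likely means the kernel does not expose the knob at all, which is worth distinguishing from "disabled" when comparing runs.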

> 
> i.e. are you sure it's actually allocating from CXL and not demoting
> cold stuff to CXL?

But then I realized that the spillover was caused by demotion rather than
by initial allocation, even though I used the MPOL_BIND | MPOL_F_NUMA_BALANCING
policy with both the toptier and CXL nodes in the nodemask.
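
To make the policy setup concrete, here is a minimal sketch of how such a policy can be requested. It is not the test program itself. The node IDs (0 for toptier, 1 for CXL) and the helper names are assumptions for illustration, and the raw syscall is used so that libnuma is not needed; the mode and flag values are the ones in the kernel's <linux/mempolicy.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as defined in the kernel UAPI header <linux/mempolicy.h>. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_F_NUMA_BALANCING
#define MPOL_F_NUMA_BALANCING (1 << 13)
#endif

/* Assumed node IDs, for illustration only; check the real topology. */
#define TOPTIER_NODE 0
#define CXL_NODE     1

/* Hypothetical helper: build a nodemask with exactly two nodes set. */
unsigned long two_node_mask(int a, int b)
{
    int bits = (int)(8 * sizeof(unsigned long));

    assert(a >= 0 && a < bits && b >= 0 && b < bits);
    return (1UL << a) | (1UL << b);
}

/* Hypothetical helper: the mode word passed to set_mempolicy(). */
int policy_mode(void)
{
    return MPOL_BIND | MPOL_F_NUMA_BALANCING;
}

/* Apply the policy to the calling thread; returns 0 on success, -1 on error. */
long bind_with_numa_balancing(unsigned long mask)
{
    /* Last argument: upper bound on the mask bits the kernel examines. */
    return syscall(SYS_set_mempolicy, policy_mode(), &mask,
                   8 * sizeof(mask));
}
```

Calling bind_with_numa_balancing(two_node_mask(TOPTIER_NODE, CXL_NODE)) before allocating would apply the policy to subsequent allocations of that thread. Note that this only constrains where pages may be placed; it does not by itself tell you whether a page reached the CXL node through allocation or through demotion.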

> 
>> kpromoted_recorded_accesses 960620 /* Number of recorded accesses */
>> kpromoted_recorded_hwhints 960620  /* Nr accesses via HW hints, IBS in this
>> case */
>> kpromoted_recorded_pgtscans 0
>> kpromoted_record_toptier 638006 /* Nr toptier accesses */
>> kpromoted_record_added 321234 /* Nr (CXL) accesses that are tracked */
>> kpromoted_record_exists 1380
>> kpromoted_mig_right_node 0
>> kpromoted_mig_non_lru 226
>> kpromoted_mig_lru_active 47 /* Number of accesses considered for promotion
>> as determined by folio_test_active() check */

However, disabling demotion has no impact on this number (and hence on the
folio_test_active() check).
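
To put the quoted counters in proportion, here is a small sketch that works through the arithmetic. The struct and function names are made up for illustration, and using kpromoted_record_added as the denominator is my own assumption about what the most meaningful baseline is; the counter values are the ones quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the vmstat counters quoted above. */
struct kpromoted_counters {
    unsigned long recorded_accesses;
    unsigned long record_toptier;
    unsigned long record_added;
    unsigned long record_exists;
    unsigned long mig_lru_active;
};

/* Values copied from the quoted run. */
const struct kpromoted_counters quoted_run = {
    .recorded_accesses = 960620,
    .record_toptier    = 638006,
    .record_added      = 321234,
    .record_exists     = 1380,
    .mig_lru_active    = 47,
};

/* Sum of the three record_* buckets, to compare against recorded_accesses. */
unsigned long bucket_sum(const struct kpromoted_counters *c)
{
    assert(c != NULL);
    return c->record_toptier + c->record_added + c->record_exists;
}

/* Percentage of newly tracked (CXL) records that passed folio_test_active(). */
double active_share_percent(const struct kpromoted_counters *c)
{
    assert(c != NULL);
    if (c->record_added == 0)
        return 0.0;
    return 100.0 * (double)c->mig_lru_active / (double)c->record_added;
}
```

With these values the three record_* buckets add up exactly to kpromoted_recorded_accesses, and the folio_test_active() hits come to roughly 0.015% of the newly tracked CXL records.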

Regards,
Bharata.