Message-ID: <038d0332-2146-4bda-adf6-03ef58dcc3b5@amd.com>
Date: Mon, 17 Mar 2025 21:52:29 +0530
From: Bharata B Rao <bharata@....com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
AneeshKumar.KizhakeVeetil@....com, Hasan.Maruf@....com,
Jonathan.Cameron@...wei.com, Michael.Day@....com, akpm@...ux-foundation.org,
dave.hansen@...el.com, david@...hat.com, feng.tang@...el.com,
hannes@...xchg.org, honggyu.kim@...com, hughd@...gle.com,
jhubbard@...dia.com, k.shutemov@...il.com, kbusch@...a.com,
kmanaouil.dev@...il.com, leesuyeon0506@...il.com, leillc@...gle.com,
liam.howlett@...cle.com, mgorman@...hsingularity.net, mingo@...hat.com,
nadav.amit@...il.com, nphamcs@...il.com, peterz@...radead.org,
raghavendra.kt@....com, riel@...riel.com, rientjes@...gle.com,
rppt@...nel.org, shivankg@....com, shy828301@...il.com, sj@...nel.org,
vbabka@...e.cz, weixugc@...gle.com, willy@...radead.org,
ying.huang@...ux.alibaba.com, ziy@...dia.com, yuanchu@...gle.com
Subject: Re: [RFC PATCH 2/4] mm: kpromoted: Hot page info collection and
promotion daemon
On 17-Mar-25 8:35 PM, Gregory Price wrote:
> On Mon, Mar 17, 2025 at 09:09:18AM +0530, Bharata B Rao wrote:
>> On 13-Mar-25 10:14 PM, Davidlohr Bueso wrote:
>>> On Thu, 06 Mar 2025, Bharata B Rao wrote:
>>>
>>>> +static int page_should_be_promoted(struct page_hotness_info *phi)
>>>> +{
>>>> +	struct page *page = pfn_to_online_page(phi->pfn);
>>>> +	unsigned long now = jiffies;
>>>> +	struct folio *folio;
>>>> +
>>>> +	if (!page || is_zone_device_page(page))
>>>> +		return false;
>>>> +
>>>> +	folio = page_folio(page);
>>>> +	if (!folio_test_lru(folio)) {
>>>> +		count_vm_event(KPROMOTED_MIG_NON_LRU);
>>>> +		return false;
>>>> +	}
>>>> +	if (folio_nid(folio) == phi->hot_node) {
>>>> +		count_vm_event(KPROMOTED_MIG_RIGHT_NODE);
>>>> +		return false;
>>>> +	}
>>>
>>> How about using the LRU age itself:
>>
>> Sounds like a good check for page hotness.
>>
>>>
>>> if (folio_test_active())
>>> return true;
>>
>> But the numbers I obtained with this check added, didn't really hit this
>> condition all that much. I was running a multi-threaded application that
>> allocates enough memory such that the allocation spills over from DRAM node
>> to the CXL node. Threads keep touching the memory pages in random order.
>>
>
> Is demotion enabled by any chance?

Yes. I assumed that demotion had to be enabled to create enough room in
the toptier to handle promotion.

>
> i.e. are you sure it's actually allocating from CXL and not demoting
> cold stuff to CXL?

But then I realized that the spillover was caused by demotion rather than
by the initial allocation, even though I used the
MPOL_BIND | MPOL_F_NUMA_BALANCING policy with both the toptier and the CXL
node in the nodemask.

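For reference, the policy setup mentioned above could look roughly like
the userspace sketch below. This is not the actual test program: the
sketch_ names are made up for illustration, the node IDs passed in are
whatever the toptier and CXL nodes happen to be on the machine, and the
two constants are defined locally with the values from the kernel UAPI
header <linux/mempolicy.h> so that the snippet builds without it.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as in <linux/mempolicy.h>; local names are hypothetical. */
#define SKETCH_MPOL_BIND             2
#define SKETCH_MPOL_F_NUMA_BALANCING (1 << 13)

/* Build a one-word nodemask with the toptier and CXL node bits set. */
static unsigned long sketch_nodemask(unsigned int toptier_nid,
				     unsigned int cxl_nid)
{
	assert(toptier_nid < 8 * sizeof(unsigned long));
	assert(cxl_nid < 8 * sizeof(unsigned long));
	return (1UL << toptier_nid) | (1UL << cxl_nid);
}

/*
 * Bind the calling thread's allocations to both nodes and opt in to
 * NUMA balancing. Returns 0 on success, -1 with errno set on failure
 * (for example on a kernel without MPOL_F_NUMA_BALANCING support).
 */
static long sketch_bind_both_tiers(unsigned int toptier_nid,
				   unsigned int cxl_nid)
{
	unsigned long mask = sketch_nodemask(toptier_nid, cxl_nid);
	int mode = SKETCH_MPOL_BIND | SKETCH_MPOL_F_NUMA_BALANCING;

	/* maxnode: mask width plus one, since the kernel looks at maxnode - 1 bits. */
	return syscall(SYS_set_mempolicy, mode, &mask, 8 * sizeof(mask) + 1);
}
```

Calling sketch_bind_both_tiers(0, 1) before allocating would cover a
system where node 0 is DRAM and node 1 is CXL (assumed numbering). Note
that such a policy only constrains where pages may be placed; it does not
by itself tell whether a page reached the CXL node by allocation or by
demotion.
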
>
>> kpromoted_recorded_accesses 960620 /* Number of recorded accesses */
>> kpromoted_recorded_hwhints 960620 /* Nr accesses via HW hints, IBS in this
>> case */
>> kpromoted_recorded_pgtscans 0
>> kpromoted_record_toptier 638006 /* Nr toptier accesses */
>> kpromoted_record_added 321234 /* Nr (CXL) accesses that are tracked */
>> kpromoted_record_exists 1380
>> kpromoted_mig_right_node 0
>> kpromoted_mig_non_lru 226
>> kpromoted_mig_lru_active 47 /* Number of accesses considered for promotion
>> as determined by folio_test_active() check */

However, disabling demotion has no impact on this number (and hence on
the folio_test_active() check).

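To put the quoted counters in proportion, here is a small standalone
sketch that only does arithmetic on the values listed above. The struct
and helper names are hypothetical, and treating record_toptier,
record_added and record_exists as a split of recorded_accesses is just a
reading of this one run (the three happen to add up exactly), not
something the patch guarantees.

```c
#include <stdio.h>

/* Subset of the kpromoted counters quoted above (names shortened). */
struct kpromoted_counters {
	unsigned long recorded_accesses;
	unsigned long record_toptier;
	unsigned long record_added;
	unsigned long record_exists;
	unsigned long mig_non_lru;
	unsigned long mig_lru_active;
};

/* Values copied verbatim from the quoted run. */
static const struct kpromoted_counters quoted_run = {
	.recorded_accesses = 960620,
	.record_toptier    = 638006,
	.record_added      = 321234,
	.record_exists     = 1380,
	.mig_non_lru       = 226,
	.mig_lru_active    = 47,
};

/* Sum of the three record_* buckets. */
static unsigned long classified_total(const struct kpromoted_counters *c)
{
	return c->record_toptier + c->record_added + c->record_exists;
}

/* folio_test_active() hits per million tracked (CXL) accesses, rounded down. */
static unsigned long active_ppm(const struct kpromoted_counters *c)
{
	return c->mig_lru_active * 1000000UL / c->record_added;
}

/* Print both derived figures for a set of counters. */
static void print_summary(const struct kpromoted_counters *c)
{
	printf("classified: %lu of %lu recorded\n",
	       classified_total(c), c->recorded_accesses);
	printf("active hits: %lu per million tracked accesses\n",
	       active_ppm(c));
}
```

With the numbers above, the folio_test_active() condition is hit for
roughly 146 out of every million tracked accesses, i.e. about 0.015%.
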
Regards,
Bharata.