Message-ID: <20250325081832.209140-1-bharata@amd.com>
Date: Tue, 25 Mar 2025 13:48:32 +0530
From: Bharata B Rao <bharata@....com>
To: <bharata@....com>
CC: <AneeshKumar.KizhakeVeetil@....com>, <Hasan.Maruf@....com>,
<Jonathan.Cameron@...wei.com>, <Michael.Day@....com>,
<akpm@...ux-foundation.org>, <dave.hansen@...el.com>, <dave@...olabs.net>,
<david@...hat.com>, <feng.tang@...el.com>, <gourry@...rry.net>,
<hannes@...xchg.org>, <honggyu.kim@...com>, <hughd@...gle.com>,
<hyeonggon.yoo@...com>, <jhubbard@...dia.com>, <k.shutemov@...il.com>,
<kbusch@...a.com>, <kmanaouil.dev@...il.com>, <leesuyeon0506@...il.com>,
<leillc@...gle.com>, <liam.howlett@...cle.com>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<mgorman@...hsingularity.net>, <mingo@...hat.com>, <nadav.amit@...il.com>,
<nphamcs@...il.com>, <peterz@...radead.org>, <raghavendra.kt@....com>,
<riel@...riel.com>, <rientjes@...gle.com>, <rppt@...nel.org>,
<shivankg@....com>, <shy828301@...il.com>, <sj@...nel.org>, <vbabka@...e.cz>,
<weixugc@...gle.com>, <willy@...radead.org>, <ying.huang@...ux.alibaba.com>,
<yuanchu@...gle.com>, <ziy@...dia.com>
Subject: Re: [RFC PATCH 0/4] Kernel daemon for detecting and promoting hot pages
> Hi,
>
> This is an attempt towards having a single subsystem that accumulates
> hot page information from lower memory tiers and does hot page
> promotion.
>
> At the heart of this subsystem is a kernel daemon named kpromoted that
> does the following:
>
> 1. Exposes an API that other subsystems which detect/generate memory
> access information can use to inform the daemon about memory
> accesses from lower memory tiers.
> 2. Maintains the list of hot pages and attempts to promote them to
> toptiers.
>
> Currently I have added AMD IBS driver as one source that provides
> page access information as an example. This driver feeds info to
> kpromoted in this RFC patchset.
FWIW, here are some numbers from kpromoted-driven hot page promotion with
IBS as the hotness source:
Test 1
======
Memory is allocated explicitly on the DRAM and CXL nodes, and no demotion
activity is seen.
Benchmark details
-----------------
* Memory is allocated initially on DRAM and CXL nodes separately.
* Two threads: One accessing DRAM-allocated and other CXL-allocated memory.
* The benchmark divides the memory area into regions and accesses pages within
each region randomly and repetitively. In the test config shown below, the
allocated memory is divided into regions of 1GB size and each such region is
accessed repetitively (512 times), with 21474836480 random accesses in each
repetition.
* Benchmark score is time taken for accesses to complete, lower is better
* Data accesses from CXL node are expected to trigger promotion
* Test system has 2 DRAM nodes (128G each) and a CXL node (128G)
kernel.numa_balancing         2 for base, 0 for kpromoted
demotion                      true
Threads run on                Node 1
Memory allocated on           Node 1 (DRAM) and Node 2 (CXL)
Initial allocation ratio      75% on DRAM
Allocated memory size         160G (mmap, MAP_POPULATE)
Initial memory on DRAM node   120G
Initial memory on CXL node    40G
Hot region size               1G
Access pattern                random
Access granularity            4K
Load/store ratio              50% loads + 50% stores
Number of accesses            21474836480
Nr access repetitions         512
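For readers who want to picture the access pattern, here is a scaled-down
sketch of what the description above amounts to. It is not the actual
benchmark source: the function names access_region() and run_benchmark() are
made up for illustration, a bytearray stands in for the mmap'ed memory, and
alternating loads and stores is only one assumed way of getting the 50/50
ratio.

```python
import random

PAGE_SIZE = 4096  # access granularity from the test config (4K)


def access_region(buf, region_start, region_size, nr_accesses, rng):
    """Touch random pages inside one region; returns (loads, stores).

    Assumption: loads and stores simply alternate to get a 50/50 mix.
    """
    nr_pages = region_size // PAGE_SIZE
    loads = stores = 0
    sink = 0
    for i in range(nr_accesses):
        # Pick a random page within the region and touch its first byte.
        off = region_start + rng.randrange(nr_pages) * PAGE_SIZE
        if i % 2 == 0:
            sink ^= buf[off]                    # load
            loads += 1
        else:
            buf[off] = (buf[off] + 1) & 0xFF    # store
            stores += 1
    return loads, stores


def run_benchmark(total_size, region_size, nr_accesses, nr_repetitions, seed=0):
    """Walk the regions in order; hit each one nr_repetitions times."""
    rng = random.Random(seed)
    buf = bytearray(total_size)  # stand-in for the mmap'ed area
    total = 0
    for region_start in range(0, total_size, region_size):
        for _ in range(nr_repetitions):
            loads, stores = access_region(buf, region_start, region_size,
                                          nr_accesses, rng)
            total += loads + stores
    return total


if __name__ == "__main__":
    # Tiny stand-in sizes; the real run uses 1G regions, 512 repetitions
    # and 21474836480 accesses per repetition.
    print(run_benchmark(total_size=16 * PAGE_SIZE, region_size=4 * PAGE_SIZE,
                        nr_accesses=10, nr_repetitions=3))
```

The point of the nested loops is that only one region is hot at any given
time, so a promotion mechanism has to notice each newly hot region while it is
still being accessed.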
Benchmark completion time
-------------------------
Base, NUMAB=2                 261s
kpromoted-ibs, NUMAB=0        281s
Stats comparison
----------------
                                Base,NUMAB=2    kpromoted-IBS,NUMAB=0
pgdemote_kswapd                 0               0
pgdemote_direct                 0               0
numa_pte_updates                10485760        0
numa_hint_faults                4427809         0
numa_pages_migrated             388229          374765
kpromoted_recorded_accesses                     1651130    /* nr accesses reported to kpromoted */
kpromoted_recorded_hwhints                      1651130    /* nr accesses coming from IBS */
kpromoted_record_toptier                        1269697    /* nr accesses from toptier/DRAM */
kpromoted_record_added                          378090     /* nr accesses considered for promotion */
kpromoted_mig_promoted                          374765     /* nr pages promoted */
hwhint_nr_events                                1674227    /* nr events reported by IBS */
hwhint_dram_accesses                            1269626    /* nr DRAM accesses reported by IBS */
hwhint_cxl_accesses                             381435     /* nr Extmem (CXL) accesses reported by IBS */
hwhint_useful_samples                           1651110    /* nr actionable samples as per IBS driver */
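A few derived ratios make the Test 1 numbers above easier to compare. The
snippet below is only a reading aid: the ratio() helper and variable names are
made up, and dividing kpromoted_mig_promoted by kpromoted_record_added assumes
each access considered for promotion corresponds to one candidate page, which
the counters do not guarantee.

```python
def ratio(numerator, denominator):
    """Plain ratio; returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


# Test 1 values copied from the kpromoted-IBS, NUMAB=0 column.
test1 = {
    "kpromoted_record_added": 378090,
    "kpromoted_mig_promoted": 374765,
    "hwhint_nr_events": 1674227,
    "hwhint_cxl_accesses": 381435,
}
base_secs, kpromoted_secs = 261, 281  # benchmark completion times

# Share of promotion candidates that were actually migrated (assumed 1:1
# between recorded accesses and candidate pages).
promoted_share = ratio(test1["kpromoted_mig_promoted"],
                       test1["kpromoted_record_added"])
# Share of IBS events that hit CXL memory.
cxl_share = ratio(test1["hwhint_cxl_accesses"], test1["hwhint_nr_events"])
# Run time of kpromoted relative to the NUMAB=2 base.
slowdown = ratio(kpromoted_secs - base_secs, base_secs)

print(f"promoted/considered: {promoted_share:.1%}")
print(f"CXL share of IBS events: {cxl_share:.1%}")
print(f"run time vs base: +{slowdown:.1%}")
```

In other words, nearly all of the candidates kpromoted queued in Test 1 were
promoted, and the kpromoted run still took about 8% longer than the base.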
Test 2
======
Memory is allocated with DRAM and CXL nodes in the affinity mask with
MPOL_BIND + MPOL_F_NUMA_BALANCING.
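As a rough illustration of that policy setup, the mode word and nodemask could
be built as below. The constants are the values from <linux/mempolicy.h>; the
nodemask() helper is made up for this sketch, and how the benchmark really
installs the policy (for example via set_mempolicy(2)) is not shown in this
mail, so treat that part as an assumption.

```python
MPOL_BIND = 2                    # value from <linux/mempolicy.h>
MPOL_F_NUMA_BALANCING = 1 << 13  # value from <linux/mempolicy.h>


def nodemask(nodes):
    """Build a nodemask bitmap with one bit set per NUMA node id."""
    mask = 0
    for node in nodes:
        mask |= 1 << node
    return mask


# Mode word as it would be handed to the memory policy syscall.
mode = MPOL_BIND | MPOL_F_NUMA_BALANCING
# Node 1 (DRAM) and Node 2 (CXL), as used in this test.
mask = nodemask([1, 2])

print(hex(mode), bin(mask))
```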
Benchmark details
-----------------
* Initially, the allocated memory spreads from DRAM over to CXL, which involves
demotion
* Single thread accesses the memory
* The benchmark divides the memory area into regions and accesses pages within
each region randomly and repetitively. In the test config shown below, the
allocated memory is divided into regions of 1GB size and each such region is
accessed repetitively (512 times), with 21474836480 random accesses in each
repetition.
* Benchmark score is time taken for accesses to complete, lower is better
* Data accesses from CXL node are expected to trigger promotion
* Test system has 2 DRAM nodes (128G each) and a CXL node (128G)
kernel.numa_balancing         2 for base, 0 for kpromoted
demotion                      true
Threads run on                Node 1
Memory allocated on           Node 1 (DRAM) and Node 2 (CXL)
Allocated memory size         192G (mmap, MAP_POPULATE)
Hot region size               1G
Access pattern                random
Access granularity            4K
Load/store ratio              50% loads + 50% stores
Number of accesses            21474836480
Nr access repetitions         512
Benchmark completion time
-------------------------
Base, NUMAB=2                 628s
kpromoted-ibs, NUMAB=0        626s
Stats comparison
----------------
                                Base,NUMAB=2    kpromoted-IBS,NUMAB=0
pgdemote_kswapd                 73187           2196028
pgdemote_direct                 0               0
numa_pte_updates                27511631        0
numa_hint_faults                10010852        0
numa_pages_migrated             14              611177     /* such a low number of promotions is unexpected in Base, need to recheck */
kpromoted_recorded_accesses                     1883570
kpromoted_recorded_hwhints                      1883570
kpromoted_record_toptier                        1262088
kpromoted_record_added                          616273
kpromoted_mig_promoted                          611077
hwhint_nr_events                                1904619
hwhint_dram_accesses                            1261758
hwhint_cxl_accesses                             621428
hwhint_useful_samples                           1883543