Message-ID: <20251006105346.00004634@huawei.com>
Date: Mon, 6 Oct 2025 10:53:46 +0100
From: Jonathan Cameron <jonathan.cameron@...wei.com>
To: Bharata B Rao <bharata@....com>
CC: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<dave.hansen@...el.com>, <gourry@...rry.net>, <hannes@...xchg.org>,
	<mgorman@...hsingularity.net>, <mingo@...hat.com>, <peterz@...radead.org>,
	<raghavendra.kt@....com>, <riel@...riel.com>, <rientjes@...gle.com>,
	<sj@...nel.org>, <weixugc@...gle.com>, <willy@...radead.org>,
	<ying.huang@...ux.alibaba.com>, <ziy@...dia.com>, <dave@...olabs.net>,
	<nifan.cxl@...il.com>, <xuezhengchu@...wei.com>, <yiannis@...corp.com>,
	<akpm@...ux-foundation.org>, <david@...hat.com>, <byungchul@...com>,
	<kinseyho@...gle.com>, <joshua.hahnjy@...il.com>, <yuanchu@...gle.com>,
	<balbirs@...dia.com>, <alok.rathore@...sung.com>
Subject: Re: [RFC PATCH v2 8/8] mm: sched: Move hot page promotion from
 NUMAB=2 to kpromoted

On Mon, 6 Oct 2025 11:27:21 +0530
Bharata B Rao <bharata@....com> wrote:

> On 03-Oct-25 6:08 PM, Jonathan Cameron wrote:
> > On Wed, 10 Sep 2025 20:16:53 +0530
> > Bharata B Rao <bharata@....com> wrote:
> >   
> >> Currently hot page promotion (NUMA_BALANCING_MEMORY_TIERING
> >> mode of NUMA Balancing) does hot page detection (via hint faults),
> >> hot page classification and eventual promotion, all by itself and
> >> sits within the scheduler.
> >>
> >> With the new hot page tracking and promotion mechanism being
> >> available, NUMA Balancing can limit itself to detection of
> >> hot pages (via hint faults) and off-load rest of the
> >> functionality to the common hot page tracking system.
> >>
> >> pghot_record_access(PGHOT_HINT_FAULT) API is used to feed the
> >> hot page info. In addition, the migration rate limiting and
> >> dynamic threshold logic are moved to kpromoted so that the same
> >> can be used for hot pages reported by other sources too.
> >>
> >> Signed-off-by: Bharata B Rao <bharata@....com>  
> > 
> > Making a direct replacement without any fallback to previous method
> > is going to need a lot of data to show there are no important regressions.
> > 
> > So bold move if that's the intent!   
> 
> Firstly I am only moving the existing hot page heuristics that is part of
> NUMAB=2 to kpromoted so that the same can be applied to hot pages being
> identified by other sources. So the hint fault mechanism that is inherent
> to NUMAB=2 still remains.

That makes sense.

> 
> In fact, kscand effort started as a potential replacement for the existing
> hot page promotion mechanism by getting rid of hint faults and moving the
> page table scanning out of process context.

Understood, and I'm in favor of that approach, but I'm not sure it will be
a fit for all workloads.

> 
> In any case, I will start including numbers from the next post.

Great.

> >>  
> >>  static unsigned int sysctl_pghot_freq_window = KPROMOTED_FREQ_WINDOW;
> >>  
> >> +/* Restrict the NUMA promotion throughput (MB/s) for each target node. */
> >> +static unsigned int sysctl_pghot_promote_rate_limit = 65536;  
> > 
> > If the comment correlates with the value, this is 64 GiB/s?  That seems
> > unlikely if I guess possible.  
> 
> IIUC, the existing logic tries to limit promotion rate to 64 GiB/s by
> limiting the number of candidate pages that are promoted within the
> 1s observation interval.
> 
> Are you saying that achieving the rate of 64 GiB/s is not possible
> or unlikely?

Seems rather too high to me, but maybe I just have the wrong mental model
of what we should be moving.
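
For scale, here is a rough sketch of what the default works out to in pages.
It assumes 4 KiB base pages and treats the MB/s value as MiB/s, which is how
I read the comment in the patch. The example_* names are made up for
illustration and are not from the series.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages; other page sizes change the result. */
#define EXAMPLE_PAGE_SHIFT 12

/* Convert a promotion limit in MiB/s into pages per second. */
static inline uint64_t example_mbps_to_pages_per_sec(uint64_t mbps)
{
	/* 1 MiB = 2^20 bytes, one page = 2^EXAMPLE_PAGE_SHIFT bytes. */
	return mbps << (20 - EXAMPLE_PAGE_SHIFT);
}

/*
 * Hypothetical check: may nr more pages be promoted in the current
 * 1s observation interval without exceeding the limit?
 */
static inline int example_within_limit(uint64_t promoted_in_window,
				       uint64_t nr, uint64_t mbps)
{
	return promoted_in_window + nr <= example_mbps_to_pages_per_sec(mbps);
}
```

With the default of 65536 this gives 2^24 (about 16.8 million) 4 KiB pages
per second per target node, which is why the value looks high to me.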
> 
> >   
> >> +
> >>  #ifdef CONFIG_SYSCTL
> >>  static const struct ctl_table pghot_sysctls[] = {
> >>  	{
> >> @@ -44,8 +50,17 @@ static const struct ctl_table pghot_sysctls[] = {
> >>  		.proc_handler	= proc_dointvec_minmax,
> >>  		.extra1		= SYSCTL_ZERO,
> >>  	},
> >> +	{
> >> +		.procname	= "pghot_promote_rate_limit_MBps",
> >> +		.data		= &sysctl_pghot_promote_rate_limit,
> >> +		.maxlen		= sizeof(unsigned int),
> >> +		.mode		= 0644,
> >> +		.proc_handler	= proc_dointvec_minmax,
> >> +		.extra1		= SYSCTL_ZERO,
> >> +	},
> >>  };
> >>  #endif
> >> +  
> > Put that in earlier patch to reduce noise here.  
> 
> This patch moves the hot page heuristics to kpromoted and hence this
> related sysctl is also being moved in this patch.

I just mean the blank line - not the block above.
This is just a patch set tidy-up comment.

Jonathan


