Message-ID: <20250205160529.GB1183495@cmpxchg.org>
Date: Wed, 5 Feb 2025 11:05:29 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Bharata B Rao <bharata@....com>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>,
	Raghavendra K T <raghavendra.kt@....com>, linux-mm@...ck.org,
	akpm@...ux-foundation.org, lsf-pc@...ts.linux-foundation.org,
	gourry@...rry.net, nehagholkar@...a.com, abhishekd@...a.com,
	ying.huang@...ux.alibaba.com, nphamcs@...il.com,
	feng.tang@...el.com, kbusch@...a.com, Hasan.Maruf@....com,
	sj@...nel.org, david@...hat.com, willy@...radead.org,
	k.shutemov@...il.com, mgorman@...hsingularity.net, vbabka@...e.cz,
	hughd@...gle.com, rientjes@...gle.com, shy828301@...il.com,
	liam.howlett@...cle.com, peterz@...radead.org, mingo@...hat.com,
	nadav.amit@...il.com, shivankg@....com, ziy@...dia.com,
	jhubbard@...dia.com, AneeshKumar.KizhakeVeetil@....com,
	linux-kernel@...r.kernel.org, jon.grimm@....com,
	santosh.shukla@....com, Michael.Day@....com, riel@...riel.com,
	weixugc@...gle.com, leesuyeon0506@...il.com, honggyu.kim@...com,
	leillc@...gle.com, kmanaouil.dev@...il.com, rppt@...nel.org,
	dave.hansen@...el.com
Subject: Re: [LSF/MM/BPF TOPIC] Unifying sources of page temperature
 information - what info is actually wanted?

On Wed, Feb 05, 2025 at 11:54:05AM +0530, Bharata B Rao wrote:
> On 31-Jan-25 6:39 PM, Jonathan Cameron wrote:
> > On Fri, 31 Jan 2025 12:28:03 +0000
> > Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> > 
> >>> Here is the list of potential discussion points:
> >> ...
> >>
> >>> 2. Possibility of maintaining single source of truth for page hotness that would
> >>> maintain hot page information from multiple sources and let other sub-systems
> >>> use that info.
> >> Hi,
> >>
> >> I was thinking of proposing a separate topic on a single source of hotness,
> >> but this question covers it so I'll add some thoughts here instead.
> >> I think we are very early, but sharing some experience and thoughts in a
> >> session may be useful.
> > 
> > Thinking more on this over lunch, I think it is worth calling this out as a
> > potential session topic in its own right rather than trying to find
> > time within other sessions.  Hence the title change.
> > 
> > I think a session would start with a brief listing of the temperature sources
> > we have and those on the horizon to motivate what we are unifying, then
> > discussion to focus on need for such a unification + requirements
> > (maybe with a straw man).
> 
> Here is a compilation of available temperature sources and how the 
> hot/access data is consumed by different subsystems:

This is super useful, thanks for collecting this.

> PA-Physical address available
> VA-Virtual address available
> AA-Access time available
> NA-accessing Node info available
> 
> I have left the slot blank for those which I am not sure about.
> ==================================================
> Temperature		PA	VA	AA	NA
> source
> ==================================================
> PROT_NONE faults	Y	Y	Y	Y
> --------------------------------------------------
> folio_mark_accessed()	Y		Y	Y
> --------------------------------------------------

For folio_mark_accessed(), the VA info is available when it is called
from the unmap path, but usually it isn't - or doesn't meaningfully
exist, as in the case of unmapped buffered IO.

I'd say it's an N.

> PTE A bit		Y	Y	N	N
> --------------------------------------------------
> Platform hints		Y	Y	Y	Y
> (AMD IBS)
> --------------------------------------------------
> Device hints		Y
> (CXL HMU)
> ==================================================
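
As a rough illustration only (not a proposal for actual kernel code), the
availability table above could be encoded as per-source capability bits, so
that a consumer can ask which sources can satisfy its needs. All names here
are hypothetical. Blank slots are treated as "not known to be available",
and folio_mark_accessed() is entered with VA=N as suggested above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits, one per column of the table. */
enum temp_cap {
	TEMP_CAP_PA = 1u << 0,	/* physical address available */
	TEMP_CAP_VA = 1u << 1,	/* virtual address available */
	TEMP_CAP_AA = 1u << 2,	/* access time available */
	TEMP_CAP_NA = 1u << 3,	/* accessing node info available */
};

#define TEMP_CAP_ALL (TEMP_CAP_PA | TEMP_CAP_VA | TEMP_CAP_AA | TEMP_CAP_NA)

/* Hypothetical descriptor; not an existing kernel structure. */
struct temp_source {
	const char *name;
	unsigned int caps;
};

static const struct temp_source temp_sources[] = {
	{ "PROT_NONE faults",      TEMP_CAP_ALL },
	/* VA taken as N, per the comment on this row */
	{ "folio_mark_accessed()", TEMP_CAP_PA | TEMP_CAP_AA | TEMP_CAP_NA },
	{ "PTE A bit",             TEMP_CAP_PA | TEMP_CAP_VA },
	{ "AMD IBS",               TEMP_CAP_ALL },
	/* only PA is filled in; blank slots are assumed unavailable here */
	{ "CXL HMU",               TEMP_CAP_PA },
};

#define NR_TEMP_SOURCES (sizeof(temp_sources) / sizeof(temp_sources[0]))

/*
 * Return a bitmask of indices into temp_sources[] for every source
 * that provides all of the capabilities in @need.
 */
static unsigned int temp_sources_providing(unsigned int need)
{
	unsigned int mask = 0;

	assert(NR_TEMP_SOURCES <= 8 * sizeof(mask));

	for (size_t i = 0; i < NR_TEMP_SOURCES; i++)
		if ((temp_sources[i].caps & need) == need)
			mask |= 1u << i;

	return mask;
}
```

With this encoding, a consumer that needs both a virtual address and an
access time would only be offered PROT_NONE faults and the IBS hints.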

For the following table, it might be useful to add *when* the source
produces this information. Sampling frequency is a likely challenge:
consumers have different requirements, and overhead should be limited
to the minimum required to serve enabled consumers.
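
To make the "minimum required to serve enabled consumers" point concrete,
here is a minimal sketch, assuming each consumer can state the longest
sampling period it tolerates. The struct, field and function names are
made up for illustration and do not correspond to existing kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer descriptor; not an existing kernel structure. */
struct temp_consumer {
	const char *name;
	int enabled;
	uint64_t max_period_us;	/* longest sampling period it tolerates */
};

/*
 * Pick the sampling period for a shared source: the shortest period
 * required by any *enabled* consumer. Returns 0 when no consumer is
 * enabled, meaning the source does not need to sample at all.
 */
static uint64_t temp_effective_period_us(const struct temp_consumer *c,
					 size_t n)
{
	uint64_t period = 0;

	for (size_t i = 0; i < n; i++) {
		if (!c[i].enabled)
			continue;
		/* 0 is reserved for "no sampling needed" */
		assert(c[i].max_period_us > 0);
		if (period == 0 || c[i].max_period_us < period)
			period = c[i].max_period_us;
	}

	return period;
}
```

A disabled consumer with a tight requirement then costs nothing: the source
only samples as fast as the most demanding enabled consumer needs.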

Here is an (incomplete) attempt - sorry about the long lines:

> And here is an attempt to compile how different subsystems
> use the above data:
> ======================================================================================================
> Source                  Subsystem       Consumption               Activation/Frequency
> ======================================================================================================
> PROT_NONE faults        NUMAB           NUMAB=1 locality based    While task is running,
> via process pgtable                     balancing                 rate varies on observed
> walk                                    NUMAB=2 hot page          locality and sysctl knobs.
>                                         promotion
> ======================================================================================================
> folio_mark_accessed()   FS/filemap/GUP  LRU list activation       On cache access and unmap
> ======================================================================================================
> PTE A bit via           Reclaim:LRU     LRU list activation,      During memory pressure
> rmap walk                               deactivation/demotion
> ======================================================================================================
> PTE A bit via           Reclaim:MGLRU   LRU list activation,      - During memory pressure
> rmap walk and process                   deactivation/demotion     - Continuous sampling (configurable)
> pgtable walk                                                        for workingset reporting
> ======================================================================================================
> PTE A bit via           DAMON           LRU activation,           Continuous sampling (configurable)?
> rmap walk                               hot page promotion,       (I believe SJ is looking into
>                                         demotion etc               auto-tuning this).
> ======================================================================================================
> Platform hints          NUMAB           NUMAB=1 Locality based
> (AMD IBS)                               balancing and
>                                         NUMAB=2 hot page
>                                         promotion
> ======================================================================================================
> Device hints            NUMAB           NUMAB=2 hot page
>                                         promotion
> ======================================================================================================
> The last two are listed as possibilities.
> 
> Feel free to correct/clarify and add more.
> 
> Regards,
> Bharata.
