Message-ID: <f9f9a61f-6798-42f4-a09e-dcdf54e0a649@amd.com>
Date: Tue, 18 Mar 2025 16:15:31 +0530
From: Bharata B Rao <bharata@....com>
To: SeongJae Park <sj@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 AneeshKumar.KizhakeVeetil@....com, Hasan.Maruf@....com,
 Jonathan.Cameron@...wei.com, Michael.Day@....com, akpm@...ux-foundation.org,
 dave.hansen@...el.com, david@...hat.com, feng.tang@...el.com,
 gourry@...rry.net, hannes@...xchg.org, honggyu.kim@...com, hughd@...gle.com,
 jhubbard@...dia.com, k.shutemov@...il.com, kbusch@...a.com,
 kmanaouil.dev@...il.com, leesuyeon0506@...il.com, leillc@...gle.com,
 liam.howlett@...cle.com, mgorman@...hsingularity.net, mingo@...hat.com,
 nadav.amit@...il.com, nphamcs@...il.com, peterz@...radead.org,
 raghavendra.kt@....com, riel@...riel.com, rientjes@...gle.com,
 rppt@...nel.org, shivankg@....com, shy828301@...il.com, vbabka@...e.cz,
 weixugc@...gle.com, willy@...radead.org, ying.huang@...ux.alibaba.com,
 ziy@...dia.com, dave@...olabs.net, yuanchu@...gle.com, hyeonggon.yoo@...com,
 Harry Yoo <harry.yoo@...cle.com>
Subject: Re: [RFC PATCH 0/4] Kernel daemon for detecting and promoting hot
 pages

Hi SJ,

Thanks for your detailed points; they set up good context for the
discussion at LSFMM.

Please see my replies to a few of your questions below:

On 17-Mar-25 3:30 AM, SeongJae Park wrote:
>>
>> Currently I have added AMD IBS driver as one source that provides
>> page access information as an example. This driver feeds info to
>> kpromoted in this RFC patchset. More sources were discussed in a
>> similar context here at [1].
> 
> I was imagining how I would be able to do this with DAMON via operations set
> layer interface.  And I find the current interface is not very optimized for
> AMD IBS like sources that catches the access on the line.  That is, in a way,
> we could say AMD IBS like primitives as push-oriented, while page tables'
> accessed bits information are pull-oriented.  DAMON operations set layer
> interface is easier to be used in pull-oriented case.  I don't think it cannot
> be used for push-oriented case, but definitely the interface would better to be
> more optimized for the use case.
> 
> I'm curious if you also tried doing this by extending DAMON, and if some hidden
> problems you found.

I remember discussing this with you during DAMON BoF in one of the 
earlier LPC events, but I didn't get to try it. Guess now is the time :-)

I see the challenge with the current DAMON interfaces to integrate
IBS-provided access info. If you check my IBS driver, I store the incoming
access info from IBS into per-cpu buffers before pushing them on to the
subsystems that act on them. I would think pull-based DAMON interfaces
can consume those buffered samples rather than IBS pushing samples into
DAMON. But I am yet to get clarity on how to honor the region-based
sampling that is inherent to DAMON's functioning. Maybe using only the
samples that are of interest to the region being tracked could be one way.
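
A minimal sketch of that idea, written as a userspace Python model rather
than kernel code: the source buffers samples per CPU, and a pull-side
consumer drains the buffers and keeps only samples that fall inside a
tracked region. All names here (percpu_buf, record_sample,
drain_into_regions, NR_CPUS, BUF_CAPACITY) are hypothetical and are not
taken from the RFC patches or from DAMON.

```python
from collections import deque

# Assumption: a fixed number of CPUs, each with a bounded sample buffer.
NR_CPUS = 4
BUF_CAPACITY = 1024
percpu_buf = [deque(maxlen=BUF_CAPACITY) for _ in range(NR_CPUS)]


def record_sample(cpu, pfn, nid, time):
    """Producer side: buffer one access sample on the given CPU."""
    percpu_buf[cpu].append((pfn, nid, time))


def drain_into_regions(regions):
    """Consumer side: pull every buffered sample and count only those
    whose PFN falls inside a tracked [start, end) region."""
    counts = {region: 0 for region in regions}
    for buf in percpu_buf:
        while buf:
            pfn, _nid, _time = buf.popleft()
            for start, end in regions:
                if start <= pfn < end:
                    counts[(start, end)] += 1
                    break  # a sample is credited to at most one region
    return counts
```

Samples outside every tracked region are simply dropped on the pull
side, which is one way of keeping the region-based view while still
letting the source run push-style into its own buffers.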

> 
>>
>> This is just an early attempt to check what it takes to maintain
>> a single source of page hotness info and also separate hot page
>> detection mechanisms from the promotion mechanism. There are too
>> many open ends right now and I have listed a few of them below.
>>
>> - The API that is provided to register memory access expects
>>    the PFN, NID and time of access at the minimum. This is
>>    described more in patch 2/4. This API currently can be called
>>    only from contexts that allow sleeping and hence this rules
>>    out using it from PTE scanning paths. The API needs to be
>>    more flexible with respect to this.
>> - Some sources like PTE A bit scanning can't provide the precise
>>    time of access or the NID that is accessing the page. The latter
>>    has been an open problem to which I haven't come across a good
>>    and acceptable solution.
> 
> Agree.  PTE A bit scanning could be useful in many cases, but not every case.
> There was an RFC patchset[7] that extends DAMON for NID.  I'm planning to do
> that again using DAMON operations layer interface.  My current plan is to
> implement the prototype using prot_none page faults, and later extend for AMD
> IBS like h/w features.  Hopefully I will share a prototype or at least more
> detailed idea on LSFMMBPF 2025.
> 
>> - The way the hot page information is maintained is pretty
>>    primitive right now. Ideally we would like to store hotness info
>>    in such a way that it should be easily possible to lookup say N
>>    most hot pages.
> 
> DAMON provides a feature for lookup of N most hotpages, namely DAMOS quotas'
> access pattern based regions prioritization[5].
> 
>> - If PTE A bit scanners are considered as hotness sources, we will
>>    be bombarded with accesses. Do we want to accommodate all those
>>    accesses or just go with hotness info for fixed number of pages
>>    (possibly as a ratio of lower tier memory capacity)?
> 
> I understand you're saying about memory space overhead.  Correct me if I'm
> wrong, please.

Correct, and also the overhead of managing so much data. What I see is
that if I start pushing all the access info obtained from LRU pgtable
scanning, kpromoted would end up spending a lot of time in operations
like lookups, walking the list of hot pages, etc.

So maybe it would be better to do some sort of early processing and/or
filtering at the hotness source level itself before letting
kpromoted-like subsystems do further tracking and action.
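
One possible shape for such source-level filtering, again as a userspace
Python model under stated assumptions: the source keeps a small bounded
table of access counts and forwards a page only once it has been seen
`threshold` times. The class name SourceFilter and its parameters are
hypothetical; the RFC does not define such a filter.

```python
class SourceFilter:
    """Hypothetical pre-filter sitting at the hotness source."""

    def __init__(self, threshold, max_pages):
        self.threshold = threshold  # accesses needed before forwarding
        self.max_pages = max_pages  # hard cap on tracked pages
        self.counts = {}

    def observe(self, pfn):
        """Return True when pfn should be forwarded to the tracker."""
        if pfn not in self.counts:
            if len(self.counts) >= self.max_pages:
                return False  # table full: drop instead of growing
            self.counts[pfn] = 0
        self.counts[pfn] += 1
        if self.counts[pfn] >= self.threshold:
            del self.counts[pfn]  # reset so the page must re-qualify
            return True
        return False
```

With this, the downstream tracker sees at most one report per
`threshold` accesses to a page, and the filter's own memory use is
bounded by `max_pages` entries.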

> 
> Doesn't the same issue exist for the current implementation if the sampling
> frequency is high, and/or the aggregation window is long?
> 
> To me, hence, this looks like not a problem of the information source, but how
> to maintain the information.  Current implementation maintains it per page, so
> I think the problem is inherent.

Well yes, but the goal could be to do better than NUMAB=2, which does
per-page level tracking.

> 
> DAMON maintains the information in region abstraction that can save multiple
> pages with one data structure.  The maximum number of regions can be set by
> users, so the space overhead can be controlled.

The granularity of tracking - per-page vs range/region is a topic of 
discussion I suppose.
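
To make the trade-off concrete, here is a small userspace Python model
comparing the number of records each approach keeps. It assumes
fixed-size regions purely for illustration; DAMON's actual regions are
adaptive, and the function names below are hypothetical.

```python
def per_page_records(accessed_pfns):
    """One counter per distinct page that was accessed."""
    counts = {}
    for pfn in accessed_pfns:
        counts[pfn] = counts.get(pfn, 0) + 1
    return counts


def per_region_records(accessed_pfns, region_pages):
    """One counter per fixed-size region of region_pages pages.
    Assumption: fixed-size regions, unlike DAMON's adaptive ones."""
    counts = {}
    for pfn in accessed_pfns:
        region = pfn // region_pages
        counts[region] = counts.get(region, 0) + 1
    return counts
```

Region tracking keeps far fewer records for the same access stream, at
the cost of not distinguishing hot pages from cold ones inside a region.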

Regards,
Bharata.
