Message-ID: <dd5cd44f-3107-4337-8673-2d5667dd9ff4@lucifer.local>
Date: Thu, 31 Jul 2025 13:18:29 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: SeongJae Park <sj@...nel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>, Jann Horn <jannh@...gle.com>,
        Michal Hocko <mhocko@...e.com>, Mike Rapoport <rppt@...nel.org>,
        Pedro Falcato <pfalcato@...e.de>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Vlastimil Babka <vbabka@...e.cz>, damon@...ts.linux.dev,
        kernel-team@...a.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC v2 5/7] mm/memory: implement MM_CP_DAMON

On Tue, Jul 29, 2025 at 09:21:06PM -0700, SeongJae Park wrote:
> > > Thank you for the review and comments, Lorenzo.  I fundamentally agree with
> > > all your points.  I don't aim to merge this as-is.  Actually this patch
> > > series is more like a POC, but apparently I was rushing.  I will try to
> > > address your concerns in the next version.
> >
> > Thanks.
> >
> > I do wonder whether we really can have a whole new faulting mechanism just for
> > DAMON. Because if in future, we wanted to change how this worked, we'd be
> > constrained, and it is a very specific user.
> >
> > The issue is you need the PTE to be restored to its previous state, just like
> > NUMA balancing.
> >
> > And I really really do not like this 'oh if you turn it off you can use it for
> > DAMON' thing, it's just really odd and asking for trouble.
> >
> > I think the only _workable_ version of this would be to convert the numa
> > handling to a generic 'fault then restore' type mechanism that could be hooked
> > in to by either NUMA or DAMON.
>
> I agree, and this is my current plan for the next version of this patch.

I think we'd need some way to embed this information, but I'll see what you
come up with :)

>
> >
> > But really I think you'd _need_ to not have significantly different behaviour
> > between the two and _not_ constrain this to being only when NUMA balancing is
> > disabled.
>
> I agree with all the points.  In particular, the current interface is
> ambiguous and easy to misuse.
>
> >
> > But then you'd need to know _this_ PTE is for NUMA balancing vs. another is for
> > this stuff.
>
> Yes, this would be the ideal.  But memorizing something at the page level is ...
> always an interesting challenge in my opinion, and I have no good idea how to
> do this for now :)

Yes, this is the crux; failing that, another approach will need to be taken.
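FWIW, the per-page (well, per-PTE-entry) tagging being discussed could be
modeled in a toy, userspace-only form roughly like this. All names here are
hypothetical and this is in no way actual kernel code; it just illustrates the
idea of recording alongside the saved state *who* installed the protection, so
the fault path can dispatch to the right handler and then restore the old
state:

```c
/* Hypothetical owners of a "prot none"-style marking. */
enum prot_owner { OWNER_NONE, OWNER_NUMA, OWNER_DAMON };

/* Toy stand-in for a PTE: saved protection plus who marked it. */
struct toy_pte {
	unsigned int prot;       /* current protection bits */
	unsigned int saved_prot; /* state to restore on fault */
	enum prot_owner owner;   /* who installed the marking */
};

/* Mark an entry inaccessible, remembering prior state and owner. */
static void mark_protected(struct toy_pte *pte, enum prot_owner who)
{
	pte->saved_prot = pte->prot;
	pte->prot = 0;		/* "prot none" */
	pte->owner = who;
}

/* Generic fault path: restore the old state, then dispatch on the
 * owner tag.  The real code would do NUMA migration or a DAMON
 * access report here, depending on who owned the marking. */
static enum prot_owner handle_fault(struct toy_pte *pte)
{
	enum prot_owner who = pte->owner;

	pte->prot = pte->saved_prot;	/* 'fault then restore' */
	pte->owner = OWNER_NONE;
	return who;
}
```

The obvious catch, as noted above, is that real PTEs have very few spare bits
to carry an owner tag, which is exactly why this is the hard part.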

>
> >
> > I'm not really sure there is an upstreamable version of this, but it'd need to
> > be made generic like that if there were.
> >
> > I think it might be worth taking some time to examine whether a version of this
> > that can be sensibly generic (which could have hooks for things) is _possible_
> > before sending a v2.
>
> Agreed, I don't need to rush.  Let's take our time and discuss this sufficiently. :)

Worth just playing with different solutions.

>
> Nonetheless, I may post a follow-up version of this patch series that contains
> this one, even before we reach a conclusion about this specific patch.  I
> think I may have to do that, to share the future idea in a way that is easy to
> understand and test.  I think it might also help us understand the real ROI of
> this patch, and whether there is another option to move forward.  In that
> case, I will of course keep the RFC tag and clearly note that this patch is
> still under discussion and not intended to be merged as-is before the
> discussion is done.

OK, as long as you highlight this, indicating that you acknowledge it's not
mergeable with the current approach.

> > Well, in the above you say that you can't help but do that when a race occurs?
>
> I mean, I can't help it when CONFIG_DAMON is enabled.
>
> The race (or, handling faults caused by another entity) can happen if and
> only if all of the below things happen.
>
> 1. CONFIG_DAMON is enabled.
> 2. CONFIG_NUMA_BALANCING is enabled.
> 3. The user repeatedly turns NUMA balancing and fault-based mode DAMON on
>    and off at runtime.
>
> If any of these are not true, the race can be completely avoided.  I was
> talking about the case where the first condition is not met.

Well, not necessarily continually, just at any stage.

I do think you're limiting yourself rather by requiring NUMA balancing to be
switched off.

But in any case, for the reasons previously mentioned, I don't like this
approach and I don't think it's at all workable; rather, we either need a
generic 'restore PTE state' mechanism (one that can somehow store 'both OR
either' NUMA/DAMON state), or an entirely different approach.

Another idea might be that of converting the current NUMA balancing to something
generic, whose semantics are such that:

- It restores PTE state
- It 'happens' to relocate the mapping to the node of the CPU from which the
  access was made
- It also calls a hook.

So treat the NUMA balancing aspect as sort of implicit.

Then you can just use the machinery for marking ranges for NUMA balancing for
this, having defined a hook.

If you're indifferent as to whether you get the hook called for arbitrary NUMA
balancing faults, or have a way to identify whether or not those mappings were
marked for DAMON stats, then you have what you need without having to customise
_anything_ in the general faulting code for DAMON.

Instead you'd just set up the hook for DAMON.

This is just an untested and highly theoretical thought :)
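To make that thought slightly more concrete, here is an equally untested,
userspace-only sketch (every name is hypothetical; none of this is an existing
kernel API): the generic handler always restores the entry and does the
implicit NUMA-style relocation, and additionally invokes whatever hook was
registered, so DAMON would only need to register a callback rather than touch
the faulting code:

```c
/* Hypothetical hook invoked on every restore-type fault. */
typedef void (*restore_fault_hook_t)(unsigned long addr);

static restore_fault_hook_t restore_fault_hook;	/* NULL if unset */

/* DAMON (or anyone) registers a callback; the fault path itself
 * needs no DAMON-specific changes. */
static void register_restore_fault_hook(restore_fault_hook_t hook)
{
	restore_fault_hook = hook;
}

/* Generic handler: restore state, do the implicit NUMA work,
 * then call the hook if one is registered. */
static void generic_restore_fault(unsigned long addr, int *prot, int saved)
{
	*prot = saved;	/* restore PTE state */
	/* ... implicitly: migrate the page toward the faulting
	 * CPU's node, as NUMA balancing does today ... */
	if (restore_fault_hook)
		restore_fault_hook(addr);	/* e.g. DAMON accounting */
}

/* Example hook: count faults, as DAMON-style accounting might. */
static unsigned long hook_calls;
static void count_fault(unsigned long addr)
{
	(void)addr;
	hook_calls++;
}
```

The trade-off is the one mentioned above: with a single global hook you either
accept the hook firing on arbitrary NUMA balancing faults, or you need some
way to tell which mappings were marked for DAMON.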

> >
> > I really don't think a version of the code that results in the wrong handler
> > being used is upstreamable, sorry.
> >
> > I've not dug into the nitty gritty details on what would happen in both cases,
> > but even if it were 100% fine _now_, this is _exactly_ the kind of thing that
> > results in horrible hard-to-debug issues later, should something change.
> >
> > Implicitly 'just having to know' that we might be in the wrong fault handler
> > seems like asking for trouble, and the RoI on an optional profiling tool (this
> > is not to take away from DAMON which is a _great_ utility, I'm saying it's
> > simply not part of the _core_) isn't there.
>
> I completely understand your concerns; thank you for kindly and patiently
> continuing this discussion.  I don't need to upstream this in the short term,
> and I am open to every option.  So let's take sufficient time to discuss.
>
> I will take more time to think about a few potential options to move forward,
> and share those with a level of detail that helps us easily discuss further,
> hopefully in a few days.

Thanks!
