Message-ID: <059b42b154f04b50833743c513733089@inspur.com>
Date: Thu, 29 May 2025 03:03:12 +0000
From: Simon Wang (王传国) <wangchuanguo@...pur.com>
To: SeongJae Park <sj@...nel.org>
CC: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"hannes@...xchg.org" <hannes@...xchg.org>, "david@...hat.com"
<david@...hat.com>, "mhocko@...nel.org" <mhocko@...nel.org>,
"zhengqi.arch@...edance.com" <zhengqi.arch@...edance.com>,
"shakeel.butt@...ux.dev" <shakeel.butt@...ux.dev>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "damon@...ts.linux.dev"
<damon@...ts.linux.dev>, Jagdish Gediya <jvgediya.oss@...il.com>
Subject: Re: [PATCH 1/2] mm: migrate: restore the nmask after successfully
allocating on the target node
> + Jagdish, since the behavior that this patch tries to change was
> apparently introduced by Jagdish's commit 320080272892 ("mm/demotion:
> demote pages according to allocation fallback order").
>
> On Wed, 28 May 2025 19:10:37 +0800 wangchuanguo
> <wangchuanguo@...pur.com> wrote:
>
> > If memory is successfully allocated on the target node and the
> > function returns directly without restoring nmask, subsequent
> > migration attempts in migrate_pages() (reached via the "again" label)
> > may ignore the nmask settings,
>
> Nice finding!
>
> > thereby allowing new memory
> > allocations for migration on any node.
>
> But isn't the consequence of this behavior the opposite? That is, I think
> this behavior restricts allocation to only the specified node (mtc->nid) in
> that case, ignoring the additionally allowed fallback nodes (mtc->nmask)?
>
> Anyway, to me, this seems not intended behavior but a bug. Cc-ing
> Jagdish, who authored commit 320080272892 ("mm/demotion: demote
> pages according to allocation fallback order"), which apparently introduced
> this behavior, since I may be misreading the original author's intention.
>
Under the original logic, alloc_migrate_folio() would attempt to allocate new memory sequentially across all nodes by distance, even nodes in the same tier as the source, which makes no sense. For example, if nodes 0 and 1 are DRAM nodes and nodes 2 and 3 are CXL nodes, attempting to promote a hot page from node 2 to node 0 would erroneously fall back to nodes 2 and 3 (the same tier as the source node) if nodes 0 and 1 are out of space. This is a bug, and Patch 1 fixes it.
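
To make this concrete, here is a simplified sketch of the current (pre-patch) code path, annotated with the retry behavior described above. It abbreviates the real mm/vmscan.c source rather than quoting it exactly:

struct folio *alloc_migrate_folio(struct folio *src, unsigned long private)
{
	struct migration_target_control *mtc = (void *)private;
	/*
	 * On a second call via the "again" label in migrate_pages(),
	 * mtc->nmask is still the NULL stored below by the first call,
	 * so the caller's original allowed mask is lost here.
	 */
	nodemask_t *allowed_mask = mtc->nmask;
	struct folio *dst;

	/* First attempt: target node only, no fallback. */
	mtc->nmask = NULL;
	mtc->gfp_mask |= __GFP_THISNODE;
	dst = alloc_migration_target(src, (unsigned long)mtc);
	if (dst)
		return dst;	/* returns with mtc->nmask left NULL */

	/*
	 * Fallback attempt: on retries, allowed_mask is NULL here, so
	 * allocation may fall back by distance to any node, including
	 * nodes in the same tier as the source.
	 */
	mtc->gfp_mask &= ~__GFP_THISNODE;
	mtc->nmask = allowed_mask;

	return alloc_migration_target(src, (unsigned long)mtc);
}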
Patch 2 extends the target node range from node 0 alone to nodes 0 and 1. To accommodate users who require strict migration (e.g., migrating only to node 0 and aborting if it is full), it also adds a sysfs toggle.
Question: Should this sysfs toggle default to true (allow fallback to other nodes) or false (strict mode: migrate only to node 0, abort if full)? I would appreciate your advice on the default value, considering backward compatibility and use cases.
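
For discussion, a minimal sketch of what such a toggle could look like. The attribute name (strict_target), its location in sysfs, and the wiring into the migration path are all hypothetical here, not necessarily what Patch 2 actually does:

static bool migrate_strict_target __read_mostly;	/* default under discussion */

static ssize_t strict_target_show(struct kobject *kobj,
				  struct kobj_attribute *attr, char *buf)
{
	return sysfs_emit(buf, "%d\n", migrate_strict_target);
}

static ssize_t strict_target_store(struct kobject *kobj,
				   struct kobj_attribute *attr,
				   const char *buf, size_t count)
{
	int err = kstrtobool(buf, &migrate_strict_target);

	return err ? err : count;
}

static struct kobj_attribute strict_target_attr =
	__ATTR(strict_target, 0644, strict_target_show, strict_target_store);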
> >
> > Signed-off-by: wangchuanguo <wangchuanguo@...pur.com>
> > ---
> > mm/vmscan.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index f8dfd2864bbf..e13f17244279 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1035,11 +1035,11 @@ struct folio *alloc_migrate_folio(struct folio *src, unsigned long private)
> >  	mtc->nmask = NULL;
> >  	mtc->gfp_mask |= __GFP_THISNODE;
> >  	dst = alloc_migration_target(src, (unsigned long)mtc);
> > +	mtc->nmask = allowed_mask;
> >  	if (dst)
> >  		return dst;
>
> Restoring ->nmask looks like the right behavior to me. But if so, shouldn't
> we also restore ->gfp_mask?
Yes, it's a good idea. I will restore ->gfp_mask as well in the next version; a rough sketch of the combined result is at the end of this mail.
> >
> >  	mtc->gfp_mask &= ~__GFP_THISNODE;
> > -	mtc->nmask = allowed_mask;
> >
> >  	return alloc_migration_target(src, (unsigned long)mtc);
> >  }
> > --
> > 2.39.3
>
>
> Thanks,
> SJ
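
As mentioned above, here is a rough sketch of what the function could look like with both ->nmask and ->gfp_mask restored. It is simplified for illustration, not the exact v2 patch:

struct folio *alloc_migrate_folio(struct folio *src, unsigned long private)
{
	struct migration_target_control *mtc = (void *)private;
	nodemask_t *allowed_mask = mtc->nmask;
	struct folio *dst;

	/* First attempt: target node only, no fallback. */
	mtc->nmask = NULL;
	mtc->gfp_mask |= __GFP_THISNODE;
	dst = alloc_migration_target(src, (unsigned long)mtc);
	/*
	 * Restore both fields before any return, so a retry via the
	 * "again" label in migrate_pages() sees the caller's settings.
	 */
	mtc->nmask = allowed_mask;
	mtc->gfp_mask &= ~__GFP_THISNODE;
	if (dst)
		return dst;

	/* Fallback attempt, constrained to the allowed mask. */
	return alloc_migration_target(src, (unsigned long)mtc);
}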