lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aUTQuyUV4_ed9tSU@hyeyoo>
Date: Fri, 19 Dec 2025 13:12:43 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Qi Zheng <qi.zheng@...ux.dev>
Cc: "David Hildenbrand (Red Hat)" <david@...nel.org>, hannes@...xchg.org,
        hughd@...gle.com, mhocko@...e.com, roman.gushchin@...ux.dev,
        shakeel.butt@...ux.dev, muchun.song@...ux.dev,
        lorenzo.stoakes@...cle.com, ziy@...dia.com, imran.f.khan@...cle.com,
        kamalesh.babulal@...cle.com, axelrasmussen@...gle.com,
        yuanchu@...gle.com, weixugc@...gle.com, chenridong@...weicloud.com,
        mkoutny@...e.com, akpm@...ux-foundation.org,
        hamzamahfooz@...ux.microsoft.com, apais@...ux.microsoft.com,
        lance.yang@...ux.dev, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org, Muchun Song <songmuchun@...edance.com>,
        Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v2 13/28] mm: migrate: prevent memory cgroup release in
 folio_migrate_mapping()

On Thu, Dec 18, 2025 at 09:16:11PM +0800, Qi Zheng wrote:
> 
> 
> On 12/18/25 9:04 PM, David Hildenbrand (Red Hat) wrote:
> > On 12/18/25 14:00, Qi Zheng wrote:
> > > 
> > > 
> > > On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
> > > > On 12/18/25 12:40, Qi Zheng wrote:
> > > > > 
> > > > > 
> > > > > On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
> > > > > > On 12/18/25 10:36, Qi Zheng wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
> > > > > > > > On 12/17/25 08:27, Qi Zheng wrote:
> > > > > > > > > From: Muchun Song <songmuchun@...edance.com>
> > > > > > > > > 
> > > > > > > > > In the near future, a folio will no longer pin its corresponding
> > > > > > > > > memory cgroup. To ensure safety, it will only be appropriate to
> > > > > > > > > hold the rcu read lock or acquire a reference to the memory cgroup
> > > > > > > > > returned by folio_memcg(), thereby
> > > > > > > > > preventing it from being released.
> > > > > > > > > 
> > > > > > > > > In the current patch, the rcu read lock is employed to safeguard
> > > > > > > > > against the release of the memory cgroup in
> > > > > > > > > folio_migrate_mapping().
> > > > > > > > 
> > > > > > > > We usually avoid talking about "patches".
> > > > > > > 
> > > > > > > Got it.
> > > > > > > 
> > > > > > > > 
> > > > > > > > In __folio_migrate_mapping(), the rcu read lock ...
> > > > > > > 
> > > > > > > Will do.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > This serves as a preparatory measure for the reparenting of the
> > > > > > > > > LRU pages.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Muchun Song <songmuchun@...edance.com>
> > > > > > > > > Signed-off-by: Qi Zheng <zhengqi.arch@...edance.com>
> > > > > > > > > Reviewed-by: Harry Yoo <harry.yoo@...cle.com>
> > > > > > > > > ---
> > > > > > > > >      mm/migrate.c | 2 ++
> > > > > > > > >      1 file changed, 2 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > > > > > index 5169f9717f606..8bcd588c083ca 100644
> > > > > > > > > --- a/mm/migrate.c
> > > > > > > > > +++ b/mm/migrate.c
> > > > > > > > > @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
> > > > > > > > > address_space *mapping,
> > > > > > > > >              struct lruvec *old_lruvec, *new_lruvec;
> > > > > > > > >              struct mem_cgroup *memcg;
> > > > > > > > > +        rcu_read_lock();
> > > > > > > > >              memcg = folio_memcg(folio);
> > > > > > > > 
> > > > > > > > In general, LGTM
> > > > > > > > 
> > > > > > > > I wonder, though, whether we should embed that in the ABI.
> > > > > > > > 
> > > > > > > > Like "lock RCU and get the memcg" in one operation, to the "return
> > > > > > > > memcg
> > > > > > > > and unock rcu" in another operation.
> > > > > > > 
> > > > > > > Do you mean adding a helper function like
> > > > > > > get_mem_cgroup_from_folio()?
> > > > > > 
> > > > > > Right, something like
> > > > > > 
> > > > > > memcg = folio_memcg_begin(folio);
> > > > > > folio_memcg_end(memcg);
> > > > > 
> > > > > For some longer or might-sleep critical sections (such as those pointed
> > > > > by Johannes), perhaps it can be defined like this:
> > > > > 
> > > > > struct mem_cgroup *folio_memcg_begin(struct folio *folio)
> > > > > {
> > > > >      return get_mem_cgroup_from_folio(folio);
> > > > > }
> > > > > 
> > > > > void folio_memcg_end(struct mem_cgroup *memcg)
> > > > > {
> > > > >      mem_cgroup_put(memcg);
> > > > > }
> > > > > 
> > > > > But for some short critical sections, using RCU lock directly might
> > > > > be the most convention option?
> > > > > 
> > > > 
> > > > Then put the rcu read locking in there instead?
> > > 
> > > So for some longer or might-sleep critical sections, using:
> > > 
> > > memcg = folio_memcg_begin(folio);
> > > do_some_thing(memcg);
> > > folio_memcg_end(folio);
> > > 
> > > for some short critical sections, using:
> > > 
> > > rcu_read_lock();
> > > memcg = folio_memcg(folio);
> > > do_some_thing(memcg);
> > > rcu_read_unlock();
> > > 
> > > Right?
> > 
> > What I mean is:
> > 
> > memcg = folio_memcg_begin(folio);
> > do_some_thing(memcg);
> > folio_memcg_end(folio);
> > 
> > but do the rcu_read_lock() in folio_memcg_begin() and the
> > rcu_read_unlock() in folio_memcg_end().
> > 
> > You could also have (expensive) variants, as you describe, that mess
> > with getting/dopping the memcg.
> 
> Or simple use folio_memcg_begin(memcg)/folio_memcg_end(memcg) in all cases.
> 
> Or add a parameter to them:
> 
> struct mem_cgroup *folio_memcg_begin(struct folio *folio, bool get_refcnt)
> {
> 	struct mem_cgroup *memcg;
> 
> 	if (get_refcnt)
> 		memcg = get_mem_cgroup_from_folio(folio);
> 	else {
> 		rcu_read_lock();
> 		memcg = folio_memcg(folio);
> 	}
> 
> 	return memcg;
> }
> 
> void folio_memcg_end(struct mem_cgroup *memcg, bool get_refcnt)
> {
> 	if (get_refcnt)
> 		mem_cgroup_put(memcg);
> 	else
> 		rcu_read_unlock();
> }

I would like to vote for open coding as we do now, because I think hiding
the RCU lock / refcount acquisition into a less obvious API doesn't make
it more readable.

No strong opinion on introducing new helpers, but at least it should be
obvious that each variant has different restrictions.
 
> > But my points was about hiding the rcu details in a set of helpers.
> > 
> > Sorry if what I say is confusing.

-- 
Cheers,
Harry / Hyeonggon

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ