linux-kernel - Re: [PATCH v2 13/28] mm: migrate: prevent memory cgroup release in folio_migrate

Message-ID: <c998d9ef-5538-4ad2-9a95-a20ef299ce55@linux.dev>
Date: Mon, 22 Dec 2025 11:42:42 +0800
From: Qi Zheng <qi.zheng@...ux.dev>
To: Johannes Weiner <hannes@...xchg.org>,
 "David Hildenbrand (Red Hat)" <david@...nel.org>
Cc: hughd@...gle.com, mhocko@...e.com, roman.gushchin@...ux.dev,
 shakeel.butt@...ux.dev, muchun.song@...ux.dev, lorenzo.stoakes@...cle.com,
 ziy@...dia.com, harry.yoo@...cle.com, imran.f.khan@...cle.com,
 kamalesh.babulal@...cle.com, axelrasmussen@...gle.com, yuanchu@...gle.com,
 weixugc@...gle.com, chenridong@...weicloud.com, mkoutny@...e.com,
 akpm@...ux-foundation.org, hamzamahfooz@...ux.microsoft.com,
 apais@...ux.microsoft.com, lance.yang@...ux.dev, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
 Muchun Song <songmuchun@...edance.com>, Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v2 13/28] mm: migrate: prevent memory cgroup release in
 folio_migrate_mapping()



On 12/18/25 10:26 PM, Johannes Weiner wrote:
> On Thu, Dec 18, 2025 at 10:09:21AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 12/17/25 08:27, Qi Zheng wrote:
>>> From: Muchun Song <songmuchun@...edance.com>
>>>
>>> In the near future, a folio will no longer pin its corresponding
>>> memory cgroup. To ensure safety, it will only be appropriate to
>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>> returned by folio_memcg(), thereby preventing it from being released.
>>>
>>> In the current patch, the rcu read lock is employed to safeguard
>>> against the release of the memory cgroup in folio_migrate_mapping().
>>
>> We usually avoid talking about "patches".
>>
>> In __folio_migrate_mapping(), the rcu read lock ...
>>
>>>
>>> This serves as a preparatory measure for the reparenting of the
>>> LRU pages.
>>>
>>> Signed-off-by: Muchun Song <songmuchun@...edance.com>
>>> Signed-off-by: Qi Zheng <zhengqi.arch@...edance.com>
>>> Reviewed-by: Harry Yoo <harry.yoo@...cle.com>
>>> ---
>>>    mm/migrate.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 5169f9717f606..8bcd588c083ca 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>>>    		struct lruvec *old_lruvec, *new_lruvec;
>>>    		struct mem_cgroup *memcg;
>>>    
>>> +		rcu_read_lock();
>>>    		memcg = folio_memcg(folio);
>>
>> In general, LGTM
>>
>> I wonder, though, whether we should embed that in the API.
>>
>> Like "lock RCU and get the memcg" in one operation, and "return memcg
>> and unlock RCU" in another operation.
>>
>> Something like "start / end" semantics.
> 
> The advantage of open-coding this particular one is that 1)
> rcu_read_lock() is something the caller could already be
> holding/using, implicitly or explicitly; and 2) it's immediately
> obvious that this is an atomic section (which was already useful in
> spotting a bug in the workingset patch of this series).
> 
> "start/end" terminology hides this. "lock" we can't use because it
> would suggest binding stability. The only other idea I'd have would be
> to spell it all out:
> 
> memcg = folio_memcg_rcu_read_lock(folio);
> stuff(memcg);
> otherstuff();
> rcu_read_unlock();
> 
> But that might not be worth it. Maybe somebody can think of a better
> name. But I'd be hesitant to trade off the obviousness of what's going
> on given how simple the locking + access scheme is.

Agreed. I would also prefer to keep this open-coded for now. If a
better helper comes along later, a cleanup patch can convert the
callers.
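To make the comparison concrete, here is a rough userspace sketch of the
two shapes discussed above. It is not kernel code: rcu_read_lock(),
rcu_read_unlock(), folio_memcg() and both structs are mock stand-ins
that only track nesting depth, and folio_memcg_rcu_read_lock() is the
hypothetical name from Johannes' mail, not an existing helper. The two
memcg_id_*() callers are made up for illustration.

```c
/* Userspace mock only: every primitive here is a stand-in, not the kernel's. */
#include <assert.h>
#include <stddef.h>

struct mem_cgroup { int id; };
struct folio { struct mem_cgroup *memcg; };

static int rcu_depth;	/* mock read-side nesting counter */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Mock accessor: asserts that the caller is inside a read-side section. */
static struct mem_cgroup *folio_memcg(struct folio *folio)
{
	assert(rcu_depth > 0);
	return folio->memcg;
}

/* Open-coded variant: the read-side section is visible at the call site. */
static int memcg_id_open_coded(struct folio *folio)
{
	struct mem_cgroup *memcg;
	int id;

	rcu_read_lock();
	memcg = folio_memcg(folio);
	id = memcg ? memcg->id : -1;
	rcu_read_unlock();

	return id;
}

/* Hypothetical helper: takes the read lock and returns the memcg. */
static struct mem_cgroup *folio_memcg_rcu_read_lock(struct folio *folio)
{
	rcu_read_lock();
	return folio_memcg(folio);
}

/* Helper variant: the lock is implicit, the unlock stays explicit. */
static int memcg_id_with_helper(struct folio *folio)
{
	struct mem_cgroup *memcg = folio_memcg_rcu_read_lock(folio);
	int id = memcg ? memcg->id : -1;

	rcu_read_unlock();

	return id;
}
```

Both callers do the same thing. In the helper variant the lock
acquisition is hidden inside the helper while the unlock is still
explicit, which is the asymmetry that makes open-coding easier to read.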

Thanks,
Qi