Message-ID: <fa7b6432-a35b-4768-91dd-926b45bb6980@huaweicloud.com>
Date: Tue, 16 Dec 2025 09:38:07 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm@...ck.org, Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>, Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] mm/memcontrol: make lru_zone_size atomic and simplify
sanity check
On 2025/12/15 16:29, Kairui Song wrote:
> On Mon, Dec 15, 2025 at 3:38 PM Chen Ridong <chenridong@...weicloud.com> wrote:
>>
>> On 2025/12/15 14:45, Kairui Song wrote:
>>> From: Kairui Song <kasong@...cent.com>
>>>
>>> commit ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size")
>>> introduced a sanity check to catch memcg counter underflow, which is
>>> more like a workaround for another bug: lru_zone_size is unsigned, so
>>> an underflow will wrap it around and return an enormously large number,
>>> and then the memcg shrinker will loop almost forever because the
>>> calculated number of folios to shrink is huge. That commit also checks
>>> whether a zero value matches an empty LRU list, so we have to hold the
>>> LRU lock and do the counter adding differently depending on whether
>>> nr_pages is negative.
>>>
>>> But later commit b4536f0c829c ("mm, memcg: fix the active list aging for
>>> lowmem requests when memcg is enabled") already removed the LRU
>>> emptiness check, so doing the adding differently is meaningless now. And if
>>
>> Agree.
>>
>> I did submit a patch to address that, but it was not accepted.
>>
>> For reference, here is the link to the discussion:
>> https://lore.kernel.org/lkml/CAOUHufbCCkOBGcSPZqNY+FXcrH8+U7_nRvftzOzKUBS4hn+kuQ@mail.gmail.com/
>>
>
> Thanks for letting me know; I wasn't aware that you had noticed this too.
>
> From my side, I'm noticing that lru_zone_size has only one reader:
> shrink_lruvec -> get_scan_count -> lruvec_lru_size, while the updater
> path runs for every folio added to or removed from the LRU.
>
> The oldest commit that introduced this was trying to catch an underflow,
> with an extra sanity check for an LRU emptiness mismatch. A later
> commit removed the LRU emptiness check, and the only thing left
> here is the underflow check.
>
> An LRU counter leak (if there is one) may happen at any time for
> different reasons, and I think the moment an updater sees an
> underflowed value is not necessarily the moment the actual leak
> happens. (e.g. a folio was removed without updating the counter
> minutes ago while other folios were still on the LRU, so an updater
> only triggers the WARN much later). So the major issue here is the
> underflow mitigation.

This is exactly our concern. If we don’t read the LRU size, we won’t
get a warning even if it has overflowed. I’d like to hear the experts’
opinions on this.
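
For reference, this is roughly how I read the current check in
mm/memcontrol.c (simplified and from memory, so it may not match the
latest tree exactly). The decrement is applied before the check and the
increment after it, which is the asymmetry being discussed:

	void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
					int zid, int nr_pages)
	{
		struct mem_cgroup_per_node *mz;
		unsigned long *lru_size;
		long size;

		if (mem_cgroup_disabled())
			return;

		mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
		lru_size = &mz->lru_zone_size[zid][lru];

		/* Decrements are applied first ... */
		if (nr_pages < 0)
			*lru_size += nr_pages;

		/* ... so a fresh underflow shows up here as a negative long. */
		size = *lru_size;
		if (WARN_ONCE(size < 0, "%s: lru_size %ld\n", __func__, size))
			*lru_size = 0;	/* reset so readers never see a wrapped value */

		/* Increments are only applied after the check. */
		if (nr_pages > 0)
			*lru_size += nr_pages;
	}

Without that reset, an underflowed unsigned counter would look like a
huge lru size to lruvec_lru_size(), and get_scan_count() would ask the
shrinker to scan an absurd number of folios, which is the "loop almost
forever" case the original commit message describes.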
>
> Turning it into an atomic long should help mitigate both underflow
> and races (e.g. a counter update that was accidentally done outside
> the LRU lock). Overflow is really unlikely to happen IMO, and if it
> does happen, that's corruption, and corruption could happen anywhere.
>
> The reason I sent this patch is that I'm trying to move
> mem_cgroup_update_lru_size out of the lru lock scope in another series
> for some other feature, and I wanted to gather comments on this
> particular sanity check first; it seems a valid cleanup and
> micro-optimization on its own.
I agree, the current code is confusing. If several of us have been
confused by this, maybe it’s time we consider making some changes.
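
To confirm I understand the direction, below is just a rough sketch of
what I imagine the atomic version would look like (untested, not the
actual patch; the lru_zone_size field would become atomic_long_t and the
update would no longer need to care about the sign of nr_pages):

	void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
					int zid, int nr_pages)
	{
		struct mem_cgroup_per_node *mz;
		atomic_long_t *lru_size;
		long size;

		if (mem_cgroup_disabled())
			return;

		mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
		lru_size = &mz->lru_zone_size[zid][lru];	/* now atomic_long_t */

		/* One unconditional update; warn and clamp only on underflow. */
		size = atomic_long_add_return(nr_pages, lru_size);
		if (WARN_ONCE(size < 0, "%s: lru_size %ld\n", __func__, size))
			atomic_long_set(lru_size, 0);
	}

Being a signed atomic, an underflow shows up as a negative value instead
of wrapping around to a huge one, so the reader side stays safe even if
the WARN only fires long after the actual leak, and the check itself no
longer depends on the LRU lock.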
--
Best regards,
Ridong