Message-ID: <fa7b6432-a35b-4768-91dd-926b45bb6980@huaweicloud.com>
Date: Tue, 16 Dec 2025 09:38:07 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm@...ck.org, Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>, Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] mm/memcontrol: make lru_zone_size atomic and simplify
sanity check
On 2025/12/15 16:29, Kairui Song wrote:
> On Mon, Dec 15, 2025 at 3:38 PM Chen Ridong <chenridong@...weicloud.com> wrote:
>>
>> On 2025/12/15 14:45, Kairui Song wrote:
>>> From: Kairui Song <kasong@...cent.com>
>>>
>>> commit ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size")
>>> introduced a sanity check to catch memcg counter underflow, which is
>>> more like a workaround for another bug: lru_zone_size is unsigned, so
>>> an underflow will wrap it around and return an enormously large number,
>>> and then the memcg shrinker will loop almost forever because the
>>> calculated number of folios to shrink is huge. That commit also checks
>>> whether a zero value matches an empty LRU list, so we have to hold the
>>> LRU lock and do the counter adding differently depending on whether
>>> nr_pages is negative.
>>>
>>> But later commit b4536f0c829c ("mm, memcg: fix the active list aging for
>>> lowmem requests when memcg is enabled") already removed the LRU
>>> emptiness check, so doing the adding differently is meaningless now. And if
>>
>> Agree.
>>
>> I did submit a patch to address that, but it was not accepted.
>>
>> For reference, here is the link to the discussion:
>> https://lore.kernel.org/lkml/CAOUHufbCCkOBGcSPZqNY+FXcrH8+U7_nRvftzOzKUBS4hn+kuQ@mail.gmail.com/
>>
>
> Thanks for letting me know; I wasn't aware that you had noticed this too.
>
> From my side, I'm noticing that lru_zone_size has only one reader:
> shrink_lruvec -> get_scan_count -> lruvec_lru_size, while the updater
> path runs for every folio added to or removed from the LRU.
>
> The oldest commit that introduced this was trying to catch an underflow,
> with an extra sanity check for an LRU emptiness mismatch. A later
> commit removed the LRU emptiness check, and the only thing left
> here is the underflow check.
>
> An LRU counter leak (if there is one) may happen at any time for
> different reasons, and I think the moment an updater sees an
> underflowed value is not necessarily the moment the actual leak
> happens. (e.g. a folio was removed without updating the counter
> minutes ago while other folios were still on the LRU, so an updater
> only triggers the WARN much later). So the major issue here is the
> underflow mitigation.

This is exactly our concern. If we don’t read the LRU size, we won’t
get a warning even if it has overflowed. I’d like to hear the experts’
opinions on this.
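
For reference, this is roughly how I read the current check in
mm/memcontrol.c (simplified and from memory, so it may not match the
latest tree exactly). The decrement is applied before the check and the
increment after it, which is the asymmetry being discussed:

	void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
					int zid, int nr_pages)
	{
		struct mem_cgroup_per_node *mz;
		unsigned long *lru_size;
		long size;

		if (mem_cgroup_disabled())
			return;

		mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
		lru_size = &mz->lru_zone_size[zid][lru];

		/* Decrements are applied first ... */
		if (nr_pages < 0)
			*lru_size += nr_pages;

		/* ... so a fresh underflow shows up here as a negative long. */
		size = *lru_size;
		if (WARN_ONCE(size < 0, "%s: lru_size %ld\n", __func__, size))
			*lru_size = 0;	/* reset so readers never see a wrapped value */

		/* Increments are only applied after the check. */
		if (nr_pages > 0)
			*lru_size += nr_pages;
	}

Without that reset, an underflowed unsigned counter would look like a
huge lru size to lruvec_lru_size(), and get_scan_count() would ask the
shrinker to scan an absurd number of folios, which is the "loop almost
forever" case the original commit message describes.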
>
> Turning it into an atomic long should help mitigate both underflow
> and races (e.g. a counter update that was accidentally done outside
> the LRU lock). Overflow is really unlikely to happen IMO, and if it
> does happen, that's corruption, and corruption could happen anywhere.
>
> The reason I sent this patch is that I'm trying to move
> mem_cgroup_update_lru_size out of the lru lock scope in another series
> for some other feature, and I wanted to gather comments on this
> particular sanity check first; it seems a valid cleanup and
> micro-optimization on its own.
I agree, the current code is confusing. If several of us have been
confused by this, maybe it’s time we consider making some changes.
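
To confirm I understand the direction, below is just a rough sketch of
what I imagine the atomic version would look like (untested, not the
actual patch; the lru_zone_size field would become atomic_long_t and the
update would no longer need to care about the sign of nr_pages):

	void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
					int zid, int nr_pages)
	{
		struct mem_cgroup_per_node *mz;
		atomic_long_t *lru_size;
		long size;

		if (mem_cgroup_disabled())
			return;

		mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
		lru_size = &mz->lru_zone_size[zid][lru];	/* now atomic_long_t */

		/* One unconditional update; warn and clamp only on underflow. */
		size = atomic_long_add_return(nr_pages, lru_size);
		if (WARN_ONCE(size < 0, "%s: lru_size %ld\n", __func__, size))
			atomic_long_set(lru_size, 0);
	}

Being a signed atomic, an underflow shows up as a negative value instead
of wrapping around to a huge one, so the reader side stays safe even if
the WARN only fires long after the actual leak, and the check itself no
longer depends on the LRU lock.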
--
Best regards,
Ridong