[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3595bfda-d7a6-44a1-bc0a-414c890e00c2@linux.dev>
Date: Tue, 2 Dec 2025 11:04:02 +0800
From: Qi Zheng <qi.zheng@...ux.dev>
To: Yuanchu Xie <yuanchu@...gle.com>
Cc: Harry Yoo <harry.yoo@...cle.com>, hannes@...xchg.org, hughd@...gle.com,
mhocko@...e.com, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
muchun.song@...ux.dev, david@...hat.com, lorenzo.stoakes@...cle.com,
ziy@...dia.com, imran.f.khan@...cle.com, kamalesh.babulal@...cle.com,
axelrasmussen@...gle.com, weixugc@...gle.com, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v1 23/26] mm: vmscan: prepare for reparenting MGLRU folios
On 12/2/25 5:50 AM, Yuanchu Xie wrote:
> On Mon, Dec 1, 2025 at 9:41 AM Qi Zheng <qi.zheng@...ux.dev> wrote:
>>> Warning 1) Here we increment max_seq but we skip updating mm_state->seq.
>>> (try_to_inc_max_seq() iterates the mm list and update mm_state->seq after
>>> an iteration, but since we directly call inc_max_seq(), we don't update it)
>>>
>>> When mm_state->seq is more than one generation behind walk->seq, a warning is
>>> triggered in iterate_mm_list():
>>>
>>> VM_WARN_ON_ONCE(mm_state->seq + 1 < walk->max_seq);
>>
>> The mm_state->seq is just to record the completion of a full traversal
>> of mm_list. If we simply delete this warning, it may cause this judgment
>> in iterate_mm_list to become invalid:
>>
>> if (walk->seq <= mm_state->seq)
>> goto done;
>>
>> So it seems we can manually increase mm_state->seq during reparenting to
>> avoid this warning.
> Agreed, don't get rid of the warning as this check is supposed to make
> stale walkers exit early.
OK, will incease mm_state->seq during reparenting in the next version.
>
>>
>> However, we cannot directly call iterate_mm_list_nowalk() because we do
>> not want to reset mm_state->head and mm_state->tail to NULL. Otherwise,
>> we wouldn't be able to continue iterating over the mm_list.
>>
>
> From the original posting:
>> Of course, the same generation has different meanings in the parent and
>> child memcg, this will cause confusion in the hot and cold information of
>> folios. But other than that, this method is simple enough, the lru size
>> is correct, and there is no need to consider some concurrency issues (such
>> as lru_gen_del_folio()).
> One way to solve this is to map generations based on
> lrugen->timestamp, but of course this runs into the reading
> folio->flags issue you described. I think the current method is a good
> compromise, but the splicing of generations doesn't much make semantic
> sense. It would be good to leave a comment somewhere in
> __lru_gen_reparent_memcg to note this weirdness.
OK, will do.
Thanks!
Powered by blists - more mailing lists