Message-ID: <bf98a80a-2be0-413f-8a7a-34bb17f053cc@huawei.com>
Date: Fri, 29 Nov 2024 10:25:02 +0800
From: chenridong <chenridong@...wei.com>
To: Barry Song <21cnbao@...il.com>, Yu Zhao <yuzhao@...gle.com>
CC: Matthew Wilcox <willy@...radead.org>, Chris Li <chrisl@...nel.org>, Chen
Ridong <chenridong@...weicloud.com>, <akpm@...ux-foundation.org>,
<mhocko@...e.com>, <hannes@...xchg.org>, <yosryahmed@...gle.com>,
<yuzhao@...gle.com>, <david@...hat.com>, <ryan.roberts@....com>,
<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
<wangweiyang2@...wei.com>, <xieym_ict@...mail.com>
Subject: Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the
tail of LRU after shrinking
On 2024/11/29 7:08, Barry Song wrote:
> On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@...wei.com> wrote:
>>
>>
>>
>> On 2024/11/18 12:21, Matthew Wilcox wrote:
>>> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
>>>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@...radead.org> wrote:
>>>>>
>>>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
>>>>>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be split
>>>>>> and added to the swap cache folio by folio. After being added to the swap
>>>>>> cache, IO is submitted to write each folio back to swap, which is asynchronous.
>>>>>> When shrink_page_list finishes, the isolated folio list will be
>>>>>> moved back to the head of the inactive LRU. The inactive LRU may then look
>>>>>> like this, with 512 folios having been moved to the head of the inactive LRU.
>>>>>
>>>>> I was hoping that we'd be able to stop splitting the folio when adding
>>>>> to the swap cache. Ideally, we'd add the whole 2MB and write it back
>>>>> as a single unit.
>>>>
>>>> This is already the case: adding to the swapcache doesn’t require splitting
>>>> THPs, but failing to allocate 2MB of contiguous swap slots will.
>>>
>>> Agreed we need to understand why this is happening. As I've said a few
>>> times now, we need to stop requiring contiguity. Real filesystems don't
>>> need the contiguity (they become less efficient, but they can scatter a
>>> single 2MB folio to multiple places).
>>>
>>> Maybe Chris has a solution to this in the works?
>>>
>>
>> Hi, Chris, do you have a better idea to solve this issue?
>
> Not Chris. As I read the code again, we already have the code below to fix up
> the "missed folio_rotate_reclaimable()" issue in evict_folios():
>
>         /* retry folios that may have missed folio_rotate_reclaimable() */
>         list_move(&folio->lru, &clean);
>
> It doesn't work for you?
>
> commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
> Author: Yu Zhao <yuzhao@...gle.com>
> Date: Tue Nov 15 18:38:07 2022 -0700
> mm: multi-gen LRU: retry folios written back while isolated
>
> The page reclaim isolates a batch of folios from the tail of one of the
> LRU lists and works on those folios one by one. For a suitable
> swap-backed folio, if the swap device is async, it queues that folio for
> writeback. After the page reclaim finishes an entire batch, it puts back
> the folios it queued for writeback to the head of the original LRU list.
>
> In the meantime, the page writeback flushes the queued folios also by
> batches. Its batching logic is independent from that of the page reclaim.
> For each of the folios it writes back, the page writeback calls
> folio_rotate_reclaimable() which tries to rotate a folio to the tail.
>
>
> folio_rotate_reclaimable() only works for a folio after the page reclaim
> has put it back. If an async swap device is fast enough, the page
> writeback can finish with that folio while the page reclaim is still
> working on the rest of the batch containing it. In this case, that folio
> will remain at the head and the page reclaim will not retry it before
> reaching there.
>
> This patch adds a retry to evict_folios(). After evict_folios() has
> finished an entire batch and before it puts back folios it cannot free
> immediately, it retries those that may have missed the rotation.
> Before this patch, ~60% of folios swapped to an Intel Optane missed
> folio_rotate_reclaimable(). After this patch, ~99% of missed folios were
> reclaimed upon retry.
>
> This problem affects relatively slow async swap devices like Samsung 980
> Pro much less and does not affect sync swap devices like zram or zswap at
> all.
>
>>
>> Best regards,
>> Ridong
>
> Thanks
> Barry
Thank you for your reply, Barry.

I originally hit this issue on 5.10. I reproduced it on the -next kernel as
well, but with the CONFIG_LRU_GEN_ENABLED kconfig disabled. After testing again
with CONFIG_LRU_GEN_ENABLED enabled, the issue no longer occurs.
IIUC, commit 359a5e1416caaf9ce28396a65ed3e386cc5de663 only helps when
CONFIG_LRU_GEN_ENABLED is enabled, but the issue also exists when
CONFIG_LRU_GEN_ENABLED is disabled, and that case should be fixed too.
I read the code of commit 359a5e1416caaf9ce28396a65ed3e386cc5de663: it detects
the folios that missed the rotation in a more elaborate way, but it also makes
it much clearer what is being done. Should I implement the fix for the
non-MGLRU path in Yu Zhao's way?
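If so, something along the same lines could sit in shrink_inactive_list(),
between shrink_folio_list() and move_folios_to_lru(). Very rough and untested,
only to show the placement I have in mind; the surrounding lines are just
context from my reading of that function, and the filtering would have to
mirror what evict_folios() does:

        nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);

        /*
         * Folios whose writeback completed while they were isolated have
         * missed folio_rotate_reclaimable(); give them one more pass
         * instead of putting them back at the head of the inactive LRU.
         */
        LIST_HEAD(clean);
        list_for_each_entry_safe_reverse(folio, next, &folio_list, lru) {
                if (!folio_test_writeback(folio) && !folio_test_dirty(folio))
                        list_move(&folio->lru, &clean);
        }
        if (!list_empty(&clean)) {
                nr_reclaimed += shrink_folio_list(&clean, pgdat, sc, &stat, false);
                /* whatever still cannot be freed goes back for putback */
                list_splice(&clean, &folio_list);
        }

        spin_lock_irq(&lruvec->lru_lock);
        move_folios_to_lru(lruvec, &folio_list);

Or, as this RFC does, the written-back folios could simply be rotated to the
tail of the LRU; retrying them immediately would just match what the MGLRU
side already does.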
Best regards,
Ridong