Message-ID: <97516c66-ddb0-4bf6-a827-1b052bb5eca0@linux.alibaba.com>
Date: Fri, 9 Aug 2024 11:21:44 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: David Hildenbrand <david@...hat.com>, akpm@...ux-foundation.org,
hughd@...gle.com, wangkefeng.wang@...wei.com, chrisl@...nel.org,
ying.huang@...el.com, 21cnbao@...il.com, ryan.roberts@....com,
shy828301@...il.com, ziy@...dia.com, ioworker0@...il.com,
da.gomez@...sung.com, p.raghav@...sung.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
Luis Chamberlain <mcgrof@...nel.org>
Subject: Re: [PATCH v4 01/10] mm: vmscan: add validation before spliting shmem
large folio
On 2024/8/8 20:35, Matthew Wilcox wrote:
> On Thu, Aug 08, 2024 at 10:36:23AM +0800, Baolin Wang wrote:
>> On 2024/8/7 23:53, David Hildenbrand wrote:
>>> But now I am wondering under which circumstances we end up calling
>>> shmem_writepage() with a large folio. And I think the answer is the
>>> comment of
>>> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>>>
>>>
>>> ... so if shmem_writepage() handles+checks that, could we do
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a332cb80e928..7dfa3d6e8ba7 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>  						goto activate_locked_split;
>>>  				}
>>>  			}
>>> -		} else if (folio_test_swapbacked(folio) &&
>>> -			   folio_test_large(folio)) {
>>> -			/* Split shmem folio */
>>> -			if (split_folio_to_list(folio, folio_list))
>>> -				goto keep_locked;
>>>  		}
>>>
>>> /*
>>>
>>> instead?
>>
>> Seems reasonable to me. But we should pass the 'folio_list' to
>> shmem_writepage() to list the subpages of the large folio. Let me try.
>> Thanks.
>
> We should be trying to remove shmem_writepage(), not make it more
> complex. We're making good progress removing instances of ->writepage;
> just ceph, ecryptfs, f2fs, gfs2, hostfs, nilfs2, orangefs, vboxsf, shmem
> & swap are left. gfs2 patches are out for review.
I am afraid shmem is a bit special. IIUC, ->writepages() is used by the
writeback flusher thread to write back dirty pages of a mapping, but shmem
cannot be written back that way (mapping_can_writeback() returns false for
shmem mappings). Therefore, shmem can only be reclaimed through direct
reclaim or kswapd, and only if a swap device is set up (without swap, shmem
always stays in memory). So, for now, we should still keep
shmem_writepage() to reclaim shmem pages.
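For reference, the helper I mean looks roughly like this (paraphrased from
include/linux/backing-dev.h rather than copied, so the exact form may
differ between releases):

static inline bool mapping_can_writeback(struct address_space *mapping)
{
	return inode_to_bdi(mapping->host)->capabilities & BDI_CAP_WRITEBACK;
}

Shmem inodes end up on noop_backing_dev_info, which does not advertise
BDI_CAP_WRITEBACK, so the flusher thread never queues them for writeback.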
> As you can see from previous patches, the approach is to use
> ->writepages instead of ->writepage. There should be no need to
> handle a folio split list as splitting a folio leaves the folios in the
> page cache and they'll naturally be found by subsequent iterations.
Again, shmem is special. If a shmem folio is reclaimable (i.e. a swap
device is set up), we need to allocate contiguous swap entries for the
large folio. However, if the swap entries are heavily fragmented (there is
already a topic proposed to discuss this issue), we will not be able to
allocate contiguous swap entries for large shmem folios. Therefore, in
this case, it is necessary to split the large shmem folio so that we can
try to allocate single swap entries and still reclaim the shmem pages.
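To make the fragmentation point concrete, here is a tiny standalone model
(not the real swap allocator; all names below are made up purely for the
illustration): a large folio needs one contiguous run of swap slots, so a
fragmented swap map can fail the large allocation even though plenty of
single slots remain once the folio is split.

#include <stdbool.h>
#include <stdio.h>

#define SWAP_SLOTS 16

/* 1 = slot in use; deliberately fragmented: every other slot is taken */
static int swap_map[SWAP_SLOTS] = { 0,1,0,1, 0,1,0,1, 0,1,0,1, 0,1,0,1 };

/* find and claim 'nr' contiguous free slots, like a large-folio swap-out */
static bool alloc_contig_slots(int nr)
{
	for (int start = 0; start + nr <= SWAP_SLOTS; start++) {
		int i;

		for (i = 0; i < nr && !swap_map[start + i]; i++)
			;
		if (i == nr) {
			while (i--)
				swap_map[start + i] = 1;
			return true;
		}
	}
	return false;
}

int main(void)
{
	int nr = 4;	/* e.g. a 16K shmem folio made of 4K pages */

	if (alloc_contig_slots(nr)) {
		printf("swapped out the whole large folio\n");
		return 0;
	}

	/* fall back: split the large folio and reclaim it page by page */
	printf("no run of %d free slots, splitting the folio\n", nr);
	for (int i = 0; i < nr; i++)
		printf("subpage %d: %s\n", i,
		       alloc_contig_slots(1) ? "swapped out" : "kept in memory");
	return 0;
}

With the fragmented map above, the 4-slot allocation fails, but after the
split each 4K subpage still finds a single free slot, which is exactly the
case the shmem split fallback is there to handle.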