linux-kernel - Re: [PATCH v2] mm: shmem: skip swapcache for swapin of synchronous swap device

Message-ID: <c59bbe76-eb2d-47fd-acbd-d3dc351ede3e@linux.alibaba.com>
Date: Mon, 6 Jan 2025 14:29:45 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: akpm@...ux-foundation.org, hughd@...gle.com, david@...hat.com,
 wangkefeng.wang@...wei.com, kasong@...cent.com,
 ying.huang@...ux.alibaba.com, 21cnbao@...il.com, ryan.roberts@....com,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: shmem: skip swapcache for swapin of synchronous
 swap device



On 2025/1/6 12:59, Baolin Wang wrote:
> 
> 
> On 2025/1/6 12:07, Matthew Wilcox wrote:
>> On Mon, Jan 06, 2025 at 11:46:04AM +0800, Baolin Wang wrote:
>>> On 2025/1/2 21:10, Matthew Wilcox wrote:
>>>> On Thu, Jan 02, 2025 at 04:40:17PM +0800, Baolin Wang wrote:
>>>>> With fast swap devices (such as zram), swapin latency is crucial to
>>>>> applications. For shmem swapin, similar to anonymous memory swapin, we
>>>>> can skip the swapcache operation to improve swapin latency.
>>>>
>>>> OK, but now we have more complexity.  Why can't we always skip the
>>>> swapcache on swapin?
>>>
>>> Skipping the swapcache is used to swap in shmem large folios without
>>> splitting them. Meanwhile, since the IO latency of synchronous swap
>>> devices is relatively small, it won't cause the IO latency amplification
>>> issue.
>>>
>>> But for async swap devices, if we swap in the whole large folio at once,
>>> I am afraid the IO latency can be amplified. And I remember we still
>>> haven't reached an agreement here[1], so let's go step by step and start
>>> with the sync swap devices first.
>>
>> Regardless of whether we choose to swap-in an order-0 or a large folio,
>> my point is that we should always do it to the pagecache rather than the
>> swap cache.
> 
> IMO, this would miss the swap readahead algorithm in the swap case, 
> which can benefit the order-0 swap-in. We need more work to ensure that 
> skipping swapcache is helpful for all cases, which is why I'm starting 
> with sync swap devices first.

BTW, I tested skipping the swapcache on an SSD swap device with the hack
below, and order-0 sequential swap-in shows a significant performance
regression.

Without the following changes:
1G order-0 shmem swap-in: 8056 ms

With the following changes (skip swapcache):
1G order-0 shmem swap-in: 38536 ms


diff --git a/mm/page_io.c b/mm/page_io.c
index 9b983de351f9..1e22dedcd584 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -620,7 +620,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
         unsigned long pflags;
         bool in_thrashing;

-       VM_BUG_ON_FOLIO(!folio_test_swapcache(folio) && !synchronous, folio);
         VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
         VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio);

diff --git a/mm/shmem.c b/mm/shmem.c
index e82ef1ef1c68..2902d3477520 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2295,7 +2295,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                         fallback_order0 = true;

                 /* Skip swapcache for synchronous device. */
-               if (!fallback_order0 && data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+               if (!fallback_order0) {
                         folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
                         if (!IS_ERR(folio)) {
                                 skip_swapcache = true;