Message-ID: <CAGsJ_4z9xTd=oHDuzLbdkyhd_F=tj2A3K_dsp33dXad6pvVZpA@mail.gmail.com>
Date: Tue, 4 Nov 2025 16:26:57 +0800
From: Barry Song <21cnbao@...il.com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>, Nhat Pham <nphamcs@...il.com>,
Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosry.ahmed@...ux.dev>,
David Hildenbrand <david@...hat.com>, Youngjun Park <youngjun.park@....com>,
Hugh Dickins <hughd@...gle.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Huang, Ying" <ying.huang@...ux.alibaba.com>, Kemeng Shi <shikemeng@...weicloud.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>, linux-kernel@...r.kernel.org,
Kairui Song <kasong@...cent.com>
Subject: Re: [PATCH 04/19] mm, swap: always try to free swap cache for
SWP_SYNCHRONOUS_IO devices
On Tue, Nov 4, 2025 at 12:19 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Wed, Oct 29, 2025 at 11:59 PM Kairui Song <ryncsn@...il.com> wrote:
> >
> > From: Kairui Song <kasong@...cent.com>
> >
> > Now SWP_SYNCHRONOUS_IO devices are also using swap cache. One side
> > effect is that a folio may stay in swap cache for a longer time due to
> > lazy freeing (vm_swap_full()). This can help save some CPU / IO if folios
> > are being swapped out very frequently right after swapin, hence improving
> > the performance. But the long pinning of swap slots also increases the
> > fragmentation rate of the swap device significantly, and currently,
> > all in-tree SWP_SYNCHRONOUS_IO devices are RAM disks, so it also
> > causes the backing memory to be pinned, increasing the memory pressure.
> >
> > So drop the swap cache immediately for SWP_SYNCHRONOUS_IO devices
> > after swapin finishes. Swap cache has served its role as a
> > synchronization layer to prevent any parallel swapin from wasting
> > CPU or memory allocation, and the redundant IO is not a major concern
> > for SWP_SYNCHRONOUS_IO devices.
> >
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
> > mm/memory.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9a43d4811781..78457347ae60 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4359,12 +4359,21 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
> > return 0;
> > }
> >
> > -static inline bool should_try_to_free_swap(struct folio *folio,
> > +static inline bool should_try_to_free_swap(struct swap_info_struct *si,
> > + struct folio *folio,
> > struct vm_area_struct *vma,
> > unsigned int fault_flags)
> > {
> > if (!folio_test_swapcache(folio))
> > return false;
> > + /*
> > + * Try to free swap cache for SWP_SYNCHRONOUS_IO devices.
> > + * Redundant IO is unlikely to be an issue for them, but a
> > + * slot being pinned by swap cache may cause more fragmentation
> > + * and delayed freeing of swap metadata.
> > + */
>
> I don’t like the claim about “redundant I/O” — it sounds misleading. Those
> I/Os are not redundant; they are simply saved by swapcache, which prevents
> some swap-out I/O when a recently swap-in folio is swapped out again.
>
> So, could we make it a bit more specific in both the comment and the commit
> message?

Sorry, on second thought: consider a case where process A mmaps 100 MB and
writes to it to populate the memory, then forks process B. If that 100 MB gets
swapped out, and A and B later swap it in separately for reading, it seems that
with this change they would each get their own 100 MB copy (2 × 100 MB in
total), whereas previously they could share the same 100 MB?

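
A rough userspace sketch of that scenario, in case it helps with reproducing
it. This is only an illustration under assumptions, not a test from the
series: shared_swapin_scenario() and read_every_page() are hypothetical helper
names, MADV_PAGEOUT (Linux 5.4+) is used as a best-effort way to push the
pages to swap, and the pageout is done before fork() because, as far as I
know, MADV_PAGEOUT skips pages mapped by more than one process. The resulting
state should be the same as in the case above: both A and B hold swap entries
for the same slots and then read them back separately.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* Linux >= 5.4; value from the uapi headers */
#endif

/* Read one byte per page so that every page is faulted in by a read. */
static unsigned long read_every_page(const unsigned char *buf, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned long sum = 0;

	assert(page > 0);
	for (size_t off = 0; off < len; off += (size_t)page)
		sum += buf[off];
	return sum;
}

/*
 * Hypothetical helper: process A populates `len` bytes, asks the kernel
 * to page them out, forks process B, and then both read the range.
 * Returns 0 if both processes read back the expected data, -1 otherwise.
 */
static int shared_swapin_scenario(size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	unsigned long expect = ((len + page - 1) / page) * 0x5aUL;
	unsigned char *buf;
	int status = 0;
	int ok;
	pid_t pid;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0x5a, len); /* A writes to populate the memory */

	/* Best effort: without swap or on old kernels this does nothing. */
	(void)madvise(buf, len, MADV_PAGEOUT);

	pid = fork(); /* B inherits A's PTEs, including any swap entries */
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) /* B: read-only swapin */
		_exit(read_every_page(buf, len) == expect ? 0 : 1);

	/* A: read-only swapin of the same range */
	ok = read_every_page(buf, len) == expect;
	if (waitpid(pid, &status, 0) != pid ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		ok = 0;

	munmap(buf, len);
	return ok ? 0 : -1;
}
```

Calling shared_swapin_scenario(100UL << 20) from a small main() on a machine
with a SWP_SYNCHRONOUS_IO swap device (zram, for example) and comparing
overall memory usage before and after the patch should show whether the two
processes still share the swapped-in folios. The sketch only checks that the
data reads back correctly; it does not measure the sharing itself.
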
Thanks
Barry