Message-ID: <CAMgjq7DJMtN9_r9L5jatZN8_0GS=pKYLPcArd=ie4rBSwHWXGw@mail.gmail.com>
Date: Tue, 4 Nov 2025 18:55:09 +0800
From: Kairui Song <ryncsn@...il.com>
To: Barry Song <21cnbao@...il.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>, Nhat Pham <nphamcs@...il.com>,
Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosry.ahmed@...ux.dev>,
David Hildenbrand <david@...hat.com>, Youngjun Park <youngjun.park@....com>,
Hugh Dickins <hughd@...gle.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Huang, Ying" <ying.huang@...ux.alibaba.com>, Kemeng Shi <shikemeng@...weicloud.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 04/19] mm, swap: always try to free swap cache for
SWP_SYNCHRONOUS_IO devices
On Tue, Nov 4, 2025 at 4:27 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Tue, Nov 4, 2025 at 12:19 PM Barry Song <21cnbao@...il.com> wrote:
> >
> > On Wed, Oct 29, 2025 at 11:59 PM Kairui Song <ryncsn@...il.com> wrote:
> > >
> > > From: Kairui Song <kasong@...cent.com>
> > >
> > > Now SWP_SYNCHRONOUS_IO devices are also using swap cache. One side
> > > effect is that a folio may stay in swap cache for a longer time due to
> > > lazy freeing (vm_swap_full()). This can help save some CPU / IO if folios
> > > are being swapped out very frequently right after swapin, hence improving
> > > the performance. But the long pinning of swap slots also increases the
> > > fragmentation rate of the swap device significantly, and currently,
> > > all in-tree SWP_SYNCHRONOUS_IO devices are RAM disks, so it also
> > > causes the backing memory to be pinned, increasing the memory pressure.
> > >
> > > So drop the swap cache immediately for SWP_SYNCHRONOUS_IO devices
> > > after swapin finishes. Swap cache has served its role as a
> > > synchronization layer to prevent any parallel swapin from wasting
> > > CPU or memory allocation, and the redundant IO is not a major concern
> > > for SWP_SYNCHRONOUS_IO devices.
> > >
> > > Signed-off-by: Kairui Song <kasong@...cent.com>
> > > ---
> > > mm/memory.c | 13 +++++++++++--
> > > 1 file changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 9a43d4811781..78457347ae60 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -4359,12 +4359,21 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
> > > return 0;
> > > }
> > >
> > > -static inline bool should_try_to_free_swap(struct folio *folio,
> > > +static inline bool should_try_to_free_swap(struct swap_info_struct *si,
> > > + struct folio *folio,
> > > struct vm_area_struct *vma,
> > > unsigned int fault_flags)
> > > {
> > > if (!folio_test_swapcache(folio))
> > > return false;
> > > + /*
> > > + * Try to free swap cache for SWP_SYNCHRONOUS_IO devices.
> > > + * Redundant IO is unlikely to be an issue for them, but a
> > > + * slot being pinned by swap cache may cause more fragmentation
> > > + * and delayed freeing of swap metadata.
> > > + */
> >
> > I don’t like the claim about “redundant I/O” — it sounds misleading. Those
> > I/Os are not redundant; they are simply saved by swapcache, which prevents
> > some swap-out I/O when a recently swap-in folio is swapped out again.
> >
> > So, could we make it a bit more specific in both the comment and the commit
> > message?
>
> Sorry, on second thought—consider a case where process A mmaps 100 MB and writes
> to it to populate memory, then forks process B. If that 100 MB gets swapped out,
> and A and B later swap it in separately for reading, with this change it seems
> they would each get their own 100 MB copy (total 2 × 100 MB), whereas previously
> they could share the same 100 MB?
It's a bit tricky here: folio_free_swap() only frees the swap cache if the
folio's swap count is 0. So if A swaps in these folios first, the swap
cache won't be freed until B has also mapped these folios and reduced the
swap count.

And this function is called should_try_to_free_swap(): it only tries
to free the swap cache, and that succeeds only if swap count == 0. I think
I can add some comments on that.
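
To illustrate the rule above for the fork case, here is a toy userspace
sketch. It is not kernel code: struct toy_folio, toy_try_free_swap() and
toy_swapin() are hypothetical names made up for this example, and the model
only assumes what is stated above, i.e. that the swap cache is dropped only
once the swap count has reached 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
struct toy_folio {
	bool in_swapcache;	/* folio is still present in the swap cache */
	int swap_count;		/* swap entries still referencing the slot */
};

/* Models the rule: only free the swap cache when the swap count is 0. */
static bool toy_try_free_swap(struct toy_folio *folio)
{
	if (!folio->in_swapcache)
		return false;
	if (folio->swap_count != 0)
		return false;	/* another process still has a swap entry */
	folio->in_swapcache = false;
	return true;
}

/*
 * One process faults on its swap entry: it maps the folio, which drops
 * one swap reference, and then tries to free the swap cache.
 * Returns true if the swap cache was freed by this swapin.
 */
static bool toy_swapin(struct toy_folio *folio)
{
	assert(folio->swap_count > 0);
	folio->swap_count--;
	return toy_try_free_swap(folio);
}
```

With swap_count starting at 2 (one entry each for A and B), the first
toy_swapin() leaves the folio in the swap cache so the second process can
still find and share it. Only the second toy_swapin() frees it.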