Message-ID: <CAMgjq7CLSuSfRwMYqNL9ZU3ehpZfR6oewHsYtLD4CTXcvEKOTg@mail.gmail.com>
Date: Mon, 8 Sep 2025 22:34:04 +0800
From: Kairui Song <ryncsn@...il.com>
To: Klara Modin <klarasmodin@...il.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>, Hugh Dickins <hughd@...gle.com>, Chris Li <chrisl@...nel.org>,
Barry Song <baohua@...nel.org>, Baoquan He <bhe@...hat.com>, Nhat Pham <nphamcs@...il.com>,
Kemeng Shi <shikemeng@...weicloud.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Ying Huang <ying.huang@...ux.alibaba.com>, Johannes Weiner <hannes@...xchg.org>,
David Hildenbrand <david@...hat.com>, Yosry Ahmed <yosryahmed@...gle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 11/15] mm, swap: use the swap table for the swap cache
and switch API
On Sun, Sep 7, 2025 at 8:59 PM Klara Modin <klarasmodin@...il.com> wrote:
>
> On 2025-09-06 03:13:53 +0800, Kairui Song wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > Introduce basic swap table infrastructures, which are now just a
> > fixed-sized flat array inside each swap cluster, with access wrappers.
> >
> > Each cluster contains a swap table of 512 entries. Each table entry is
> > an opaque atomic long. It could be in 3 types: a shadow type (XA_VALUE),
> > a folio type (pointer), or NULL.
> >
> > In this first step, it only supports storing a folio or shadow, and it
> > is a drop-in replacement for the current swap cache. Convert all swap
> > cache users to use the new sets of APIs. Chris Li has been suggesting
> > using a new infrastructure for swap cache for better performance, and
> > that idea combined well with the swap table as the new backing
> > structure. Now the lock contention range is reduced to 2M clusters,
> > which is much smaller than the 64M address_space. And we can also drop
> > the multiple address_space design.
> >
> > All the internal works are done with swap_cache_get_* helpers. Swap
> > cache lookup is still lock-less like before, and the helper's contexts
> > are same with original swap cache helpers. They still require a pin
> > on the swap device to prevent the backing data from being freed.
> >
> > Swap cache updates are now protected by the swap cluster lock
> > instead of the Xarray lock. This is mostly handled internally, but new
> > __swap_cache_* helpers require the caller to lock the cluster. So, a
> > few new cluster access and locking helpers are also introduced.
> >
> > A fully cluster-based unified swap table can be implemented on top
> > of this to take care of all count tracking and synchronization work,
> > with dynamic allocation. It should reduce the memory usage while
> > making the performance even better.
> >
> > Co-developed-by: Chris Li <chrisl@...nel.org>
> > Signed-off-by: Chris Li <chrisl@...nel.org>
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
> > MAINTAINERS | 1 +
> > include/linux/swap.h | 2 -
> > mm/huge_memory.c | 13 +-
> > mm/migrate.c | 19 ++-
> > mm/shmem.c | 8 +-
> > mm/swap.h | 157 +++++++++++++++++------
> > mm/swap_state.c | 289 +++++++++++++++++++------------------------
> > mm/swap_table.h | 97 +++++++++++++++
> > mm/swapfile.c | 100 +++++++++++----
> > mm/vmscan.c | 20 ++-
> > 10 files changed, 458 insertions(+), 248 deletions(-)
> > create mode 100644 mm/swap_table.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 1c8292c0318d..de402ca91a80 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -16226,6 +16226,7 @@ F: include/linux/swapops.h
> > F: mm/page_io.c
> > F: mm/swap.c
> > F: mm/swap.h
> > +F: mm/swap_table.h
> > F: mm/swap_state.c
> > F: mm/swapfile.c
> >
>
> ...
>
> > #include <linux/swapops.h> /* for swp_offset */
>
> Now that swp_offset() is used in folio_index(), should this perhaps also be
> included for !CONFIG_SWAP?
Hi, thanks for looking at this series.
>
> > #include <linux/blk_types.h> /* for bio_end_io_t */
> >
...
> > if (unlikely(folio_test_swapcache(folio)))
>
> > - return swap_cache_index(folio->swap);
> > + return swp_offset(folio->swap);
>
> This is outside CONFIG_SWAP.
Right, but there are users of folio_index that are outside of
CONFIG_SWAP (mm/migrate.c), and swp_offset is also outside of SWAP so
that's OK.
If we wrap it, the !CONFIG_SWAP build will fail. I've tested the
!CONFIG_SWAP build on this patch and after the whole series, and it
works fine.
We should drop the usage of folio_index in migrate.c, but that's not
really related to this series.
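
To illustrate the point above, here is a minimal user-space sketch (not
kernel code) of why folio_index() still builds and behaves sensibly
without CONFIG_SWAP. The struct layouts, the MODEL_* constants and the
in_swapcache field are simplified stand-ins made up for illustration.
It also assumes folio_test_swapcache() is constant false when
CONFIG_SWAP is off, so the swp_offset() branch is never taken there.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's swp_entry_t (assumption). */
typedef struct { unsigned long val; } swp_entry_t;

/* Hypothetical layout: top bits hold the swap type, the rest the offset. */
#define MODEL_SWP_TYPE_BITS   5
#define MODEL_SWP_TYPE_SHIFT  (sizeof(unsigned long) * 8 - MODEL_SWP_TYPE_BITS)
#define MODEL_SWP_OFFSET_MASK ((1UL << MODEL_SWP_TYPE_SHIFT) - 1)

/*
 * Defined unconditionally: this models the point that swp_offset()
 * is available whether or not CONFIG_SWAP is set.
 */
static inline unsigned long swp_offset(swp_entry_t entry)
{
	return entry.val & MODEL_SWP_OFFSET_MASK;
}

/* Toy folio with only the fields this sketch needs (assumption). */
struct folio {
	bool in_swapcache;
	unsigned long index;
	swp_entry_t swap;
};

static inline bool folio_test_swapcache(const struct folio *folio)
{
#ifdef CONFIG_SWAP
	return folio->in_swapcache;
#else
	/* Assumed behaviour: always false without swap support. */
	(void)folio;
	return false;
#endif
}

/*
 * Not wrapped in CONFIG_SWAP, so callers outside swap code still
 * compile in both configurations.
 */
static inline unsigned long folio_index(const struct folio *folio)
{
	if (folio_test_swapcache(folio))
		return swp_offset(folio->swap);
	return folio->index;
}
```

Built without -DCONFIG_SWAP, folio_index() always returns folio->index.
Built with it, a folio marked as being in the swap cache returns the
offset part of folio->swap instead.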
>
> > return folio->index;
> > }
>
> ...
>
> Regards,
> Klara Modin
>