Message-ID: <CAMgjq7DAbMMmPVPqiGg_-2vghpPc9Jn8rkVNDFgq3reFx6CZtw@mail.gmail.com>
Date: Wed, 3 Sep 2025 20:54:50 +0800
From: Kairui Song <ryncsn@...il.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>, Hugh Dickins <hughd@...gle.com>, Chris Li <chrisl@...nel.org>,
Barry Song <baohua@...nel.org>, Baoquan He <bhe@...hat.com>, Nhat Pham <nphamcs@...il.com>,
Kemeng Shi <shikemeng@...weicloud.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Ying Huang <ying.huang@...ux.alibaba.com>, Johannes Weiner <hannes@...xchg.org>,
Yosry Ahmed <yosryahmed@...gle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Zi Yan <ziy@...dia.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 6/9] mm, swap: use the swap table for the swap cache and
switch API
On Wed, Sep 3, 2025 at 7:44 PM David Hildenbrand <david@...hat.com> wrote:
>
> On 22.08.25 21:20, Kairui Song wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > Introduce the basic swap table infrastructure, which for now is just a
> > fixed-size flat array inside each swap cluster, with access wrappers.
> >
> > Each cluster contains a swap table of 512 entries. Each table entry is
> > an opaque atomic long that holds one of three types of value: a shadow
> > (XA_VALUE), a folio (pointer), or NULL.
> >
> > In this first step, it only supports storing a folio or shadow, and it
> > is a drop-in replacement for the current swap cache. Convert all swap
> > cache users to the new set of APIs. Chris Li has been suggesting a new
> > infrastructure for the swap cache for better performance, and that
> > idea combined well with the swap table as the new backing structure.
> > The lock contention range is now reduced to a 2M cluster, which is
> > much smaller than the 64M address_space, and we can also drop the
> > multiple address_space design.
> >
> > All the internal work is done with the swap_cache_get_* helpers. Swap
> > cache lookup is still lockless as before, and the helpers' calling
> > contexts are the same as the original swap cache helpers'. They still
> > require a pin on the swap device to prevent the backing data from
> > being freed.
> >
> > Swap cache updates are now protected by the swap cluster lock
> > instead of the Xarray lock. This is mostly handled internally, but the
> > new __swap_cache_* helpers require the caller to lock the cluster, so
> > a few new cluster access and locking helpers are also introduced.
> >
> > A fully cluster-based unified swap table can be implemented on top
> > of this to take care of all count tracking and synchronization work,
> > with dynamic allocation. It should reduce memory usage while making
> > performance even better.
> >
> > Co-developed-by: Chris Li <chrisl@...nel.org>
> > Signed-off-by: Chris Li <chrisl@...nel.org>
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
>
> [...]
>
> > @@ -4504,7 +4504,7 @@ static void filemap_cachestat(struct address_space *mapping,
> > * invalidation, so there might not be
> > * a shadow in the swapcache (yet).
> > */
> > - shadow = get_shadow_from_swap_cache(swp);
> > + shadow = swap_cache_get_shadow(swp);
> > if (!shadow)
> > goto resched;
> > }
>
> This looks like a cleanup that can be performed separately upfront to
> make this patch smaller.
>
> Same applies to delete_from_swap_cache->swap_cache_del_folio
I can add a preparatory patch that renames a few helpers like this one
and adds kerneldoc comments for them in swap.h. That will make this
patch a bit smaller.
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 2a47cd3bb649..209580d395a1 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3721,7 +3721,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> > /* Prevent deferred_split_scan() touching ->_refcount */
> > spin_lock(&ds_queue->split_queue_lock);
> > if (folio_ref_freeze(folio, 1 + extra_pins)) {
> > - struct address_space *swap_cache = NULL;
> > + struct swap_cluster_info *swp_ci = NULL;
>
> I'm wondering if we could also perform this change upfront, so we can ...
This one doesn't seem very doable on its own, since the cluster idea
wasn't used outside of swap before this patch.
>
> > struct lruvec *lruvec;
> > int expected_refs;
> >
> > @@ -3765,8 +3765,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> > goto fail;
> > }
> >
> > - swap_cache = swap_address_space(folio->swap);
> > - xa_lock(&swap_cache->i_pages);
> > + swp_ci = swap_cluster_lock_by_folio(folio);
>
> ... perform these cleanups outside of the main patch. Just a thought.
>
>
> Because this patch is rather big and touches quite a lot of code, it
> is hard to review.
Thanks for the review!
>
> --
> Cheers
>
> David / dhildenb
>
>