Message-ID: <c492b99e-b6fb-4da8-8055-1e93c6141a12@redhat.com>
Date: Wed, 3 Sep 2025 13:41:52 +0200
From: David Hildenbrand <david@...hat.com>
To: Kairui Song <kasong@...cent.com>, linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>, Hugh Dickins <hughd@...gle.com>,
Chris Li <chrisl@...nel.org>, Barry Song <baohua@...nel.org>,
Baoquan He <bhe@...hat.com>, Nhat Pham <nphamcs@...il.com>,
Kemeng Shi <shikemeng@...weicloud.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Ying Huang <ying.huang@...ux.alibaba.com>,
Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosryahmed@...gle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 6/9] mm, swap: use the swap table for the swap cache and
switch API
On 22.08.25 21:20, Kairui Song wrote:
> From: Kairui Song <kasong@...cent.com>
>
> Introduce the basic swap table infrastructure, which for now is just a
> fixed-size flat array inside each swap cluster, with access wrappers.
>
> Each cluster contains a swap table of 512 entries. Each table entry is
> an opaque atomic long. It can hold one of three types: a shadow type
> (XA_VALUE), a folio type (pointer), or NULL.
>
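
Just to make sure I have the mental model right, an entry would
essentially be decoded along these lines, correct? (Helper names below
are mine, not necessarily what this series uses; this just leans on
xa_is_value() from <linux/xarray.h>.)

static inline bool swp_te_is_shadow(unsigned long ent)
{
	/* Shadow entries reuse the XArray value encoding (bit 0 set). */
	return xa_is_value((void *)ent);
}

static inline struct folio *swp_te_folio(unsigned long ent)
{
	/* NULL is an empty slot, a value entry is a shadow, else a folio. */
	if (!ent || xa_is_value((void *)ent))
		return NULL;
	return (struct folio *)ent;
}
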
> In this first step, it only supports storing a folio or shadow, and it
> is a drop-in replacement for the current swap cache. Convert all swap
> cache users to use the new set of APIs. Chris Li has been suggesting a
> new infrastructure for the swap cache for better performance, and that
> idea combines well with the swap table as the new backing structure.
> Lock contention is now confined to 2M clusters, which is much smaller
> than the 64M address_space scope, and we can also drop the multiple
> address_space design.
>
> All the internal work is done with swap_cache_get_* helpers. Swap
> cache lookup is still lockless as before, and the helpers are used in
> the same contexts as the original swap cache helpers. They still
> require a pin on the swap device to prevent the backing data from
> being freed.
>
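
Right, so IIUC a lookup would still look roughly like this (the
swap_cache_get_folio() name/signature is my guess at the new naming
scheme; get_swap_device()/put_swap_device() are the existing helpers):

static struct folio *example_lookup(swp_entry_t entry)
{
	struct swap_info_struct *si;
	struct folio *folio = NULL;

	/* Pin the swap device so the backing swap table can't be freed. */
	si = get_swap_device(entry);
	if (!si)
		return NULL;

	/* Lockless lookup; the helper grabs its own folio reference. */
	folio = swap_cache_get_folio(entry);
	put_swap_device(si);
	return folio;
}
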
> Swap cache updates are now protected by the swap cluster lock
> instead of the XArray lock. This is mostly handled internally, but the
> new __swap_cache_* helpers require the caller to lock the cluster, so
> a few new cluster access and locking helpers are also introduced.
>
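
So the calling convention for the __swap_cache_* helpers would be
roughly the following, right? (swap_cluster_lock_by_folio() is from
this patch; the unlock and delete helper names are my guesses.)

static void example_delete_from_swap_cache(struct folio *folio)
{
	struct swap_cluster_info *ci;

	/* Lookups stay lockless; any update takes the per-cluster lock. */
	ci = swap_cluster_lock_by_folio(folio);
	__swap_cache_del_folio(folio);
	swap_cluster_unlock(ci);
}
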
> A fully cluster-based unified swap table can be implemented on top
> of this to take care of all count tracking and synchronization work,
> with dynamic allocation. It should reduce memory usage while improving
> performance further.
>
> Co-developed-by: Chris Li <chrisl@...nel.org>
> Signed-off-by: Chris Li <chrisl@...nel.org>
> Signed-off-by: Kairui Song <kasong@...cent.com>
> ---
[...]
> @@ -4504,7 +4504,7 @@ static void filemap_cachestat(struct address_space *mapping,
> * invalidation, so there might not be
> * a shadow in the swapcache (yet).
> */
> - shadow = get_shadow_from_swap_cache(swp);
> + shadow = swap_cache_get_shadow(swp);
> if (!shadow)
> goto resched;
> }
This looks like a cleanup that can be performed separately upfront to
make this patch smaller.
The same applies to the delete_from_swap_cache() -> swap_cache_del_folio()
rename.
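
If it helps with splitting, the new name could even be introduced first
as a thin wrapper and the old one dropped once all callers are
converted, e.g. (just a sketch, keeping the current implementation):

static inline void *swap_cache_get_shadow(swp_entry_t entry)
{
	/* Transitional alias; the real body moves here once callers switch. */
	return get_shadow_from_swap_cache(entry);
}
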
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2a47cd3bb649..209580d395a1 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3721,7 +3721,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> /* Prevent deferred_split_scan() touching ->_refcount */
> spin_lock(&ds_queue->split_queue_lock);
> if (folio_ref_freeze(folio, 1 + extra_pins)) {
> - struct address_space *swap_cache = NULL;
> + struct swap_cluster_info *swp_ci = NULL;
I'm wondering if we could also perform this change upfront, so we can ...
> struct lruvec *lruvec;
> int expected_refs;
>
> @@ -3765,8 +3765,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> goto fail;
> }
>
> - swap_cache = swap_address_space(folio->swap);
> - xa_lock(&swap_cache->i_pages);
> + swp_ci = swap_cluster_lock_by_folio(folio);
... perform these cleanups outside of the main patch. Just a thought.
This patch is rather big and touches quite a lot of code, which makes it
hard to review.
--
Cheers
David / dhildenb