Message-ID: <CAMgjq7BR=99KDiSy7o_L0u_DYsnZunyokPc6FycrdExSdrdB_w@mail.gmail.com>
Date: Wed, 18 Jun 2025 11:07:25 +0800
From: Kairui Song <ryncsn@...il.com>
To: Kemeng Shi <shikemeng@...weicloud.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Matthew Wilcox <willy@...radead.org>, Chris Li <chrisl@...nel.org>, Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>, Barry Song <baohua@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] mm/shmem, swap: avoid redundant Xarray lookup during swapin
On Wed, Jun 18, 2025 at 10:49 AM Kemeng Shi <shikemeng@...weicloud.com> wrote:
> on 6/18/2025 2:35 AM, Kairui Song wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > Currently shmem calls xa_get_order to get the swap radix entry order,
> > requiring a full tree walk. This can be easily combined with the swap
> > entry value checking (shmem_confirm_swap) to avoid the duplicated
> > lookup, which should improve the performance.
> >
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
> > mm/shmem.c | 33 ++++++++++++++++++++++++---------
> > 1 file changed, 24 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 4e7ef343a29b..0ad49e57f736 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -505,15 +505,27 @@ static int shmem_replace_entry(struct address_space *mapping,
> >
> >  /*
> >   * Sometimes, before we decide whether to proceed or to fail, we must check
> > - * that an entry was not already brought back from swap by a racing thread.
> > + * that an entry was not already brought back or split by a racing thread.
> >   *
> >   * Checking folio is not enough: by the time a swapcache folio is locked, it
> >   * might be reused, and again be swapcache, using the same swap as before.
> > + * Returns the swap entry's order if it is still present, else returns -1.
> >   */
> > -static bool shmem_confirm_swap(struct address_space *mapping,
> > -			       pgoff_t index, swp_entry_t swap)
> > +static int shmem_swap_check_entry(struct address_space *mapping, pgoff_t index,
> > +				  swp_entry_t swap)
> >  {
> > -	return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
> > +	XA_STATE(xas, &mapping->i_pages, index);
> > +	int ret = -1;
> > +	void *entry;
> > +
> > +	rcu_read_lock();
> > +	do {
> > +		entry = xas_load(&xas);
> > +		if (entry == swp_to_radix_entry(swap))
> > +			ret = xas_get_order(&xas);
> > +	} while (xas_retry(&xas, entry));
> > +	rcu_read_unlock();
> > +	return ret;
> >  }
> >
> >  /*
> > @@ -2256,16 +2268,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
> >  		return -EIO;
> >
> >  	si = get_swap_device(swap);
> > -	if (!si) {
> > -		if (!shmem_confirm_swap(mapping, index, swap))
> > +	order = shmem_swap_check_entry(mapping, index, swap);
> > +	if (unlikely(!si)) {
> > +		if (order < 0)
> >  			return -EEXIST;
> >  		else
> >  			return -EINVAL;
> >  	}
> > +	if (unlikely(order < 0)) {
> > +		put_swap_device(si);
> > +		return -EEXIST;
> > +	}
> Can we re-arrange the code block as follows:
> 	order = shmem_swap_check_entry(mapping, index, swap);
> 	if (unlikely(order < 0))
> 		return -EEXIST;
>
> 	si = get_swap_device(swap);
> 	if (!si) {
> 		return -EINVAL;
> ...
Hi, thanks for the suggestion.

This may lead to a slightly higher chance of getting -EINVAL when it
should return -EEXIST, leading to user space errors.

For example, if this CPU gets interrupted right after `order =
shmem_swap_check_entry(mapping, index, swap);`, another CPU may swap
the entry in and swapoff the device. We'd then see `si = NULL` here,
but the entry has already been swapped in, so it should return
-EEXIST, not -EINVAL.

The chance is really low, so it's kind of trivial. We could do a `goto
failed` when (!si) here, but that would make the logic under `failed:`
more complex, so I'd prefer not to change the original behaviour,
which looks more correct.