Message-ID: <CAMgjq7DnaD-bH1efF9c1X0XAvZaMufzBUGxxeRrRAJBzBe59+g@mail.gmail.com>
Date: Mon, 10 Nov 2025 13:33:05 +0800
From: Kairui Song <ryncsn@...il.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: linux-mm <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>, 
	Kemeng Shi <shikemeng@...weicloud.com>, Nhat Pham <nphamcs@...il.com>, 
	Baoquan He <bhe@...hat.com>, Barry Song <baohua@...nel.org>, Chris Li <chrisl@...nel.org>, 
	Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosry.ahmed@...ux.dev>, 
	Chengming Zhou <chengming.zhou@...ux.dev>, Youngjun Park <youngjun.park@....com>, 
	LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org
Subject: Re: [PATCH] Revert "mm, swap: avoid redundant swap device pinning"

On Mon, Nov 10, 2025 at 09:01, Greg KH <gregkh@...uxfoundation.org> wrote:
>
> On Mon, Nov 10, 2025 at 02:06:03AM +0800, Kairui Song via B4 Relay wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > This reverts commit 78524b05f1a3e16a5d00cc9c6259c41a9d6003ce.
> >
> > While reviewing recent leaf entry changes, I noticed that commit
> > 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning") isn't
> > correct. It's true that almost all callers of __read_swap_cache_async are
> > already holding a swap entry reference, so the repeated swap device
> > pinning isn't needed on the same swap device, but it is possible that
> > VMA readahead (swap_vma_readahead()) may encounter swap entries from a
> > different swap device when there are multiple swap devices, and call
> > __read_swap_cache_async without holding a reference to that swap device.
> >
> > So it is possible to cause a UAF if swapoff of device A raced with
> > swapin on device B, and VMA readahead tries to read swap entries from
> > device A. It's not easy to trigger but in theory possible to cause real
> > issues. And besides, that commit made swap more vulnerable to issues
> > like corrupted page tables.
> >
> > Just revert it. __read_swap_cache_async isn't that sensitive to
> > performance after all, as it's mostly used for SSD/HDD swap devices with
> > readahead. SYNCHRONOUS_IO devices may fall back to it for swap count >
> > 1 entries, but very soon we will have a new helper and routine for
> > such devices, so they will never touch this helper or have redundant
> > swap device reference overhead.
> >
> > Fixes: 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning")
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
> >  mm/swap_state.c | 14 ++++++--------
> >  mm/zswap.c      |  8 +-------
> >  2 files changed, 7 insertions(+), 15 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index 3f85a1c4cfd9..0c25675de977 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -406,13 +406,17 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >               struct mempolicy *mpol, pgoff_t ilx, bool *new_page_allocated,
> >               bool skip_if_exists)
> >  {
> > -     struct swap_info_struct *si = __swap_entry_to_info(entry);
> > +     struct swap_info_struct *si;
> >       struct folio *folio;
> >       struct folio *new_folio = NULL;
> >       struct folio *result = NULL;
> >       void *shadow = NULL;
> >
> >       *new_page_allocated = false;
> > +     si = get_swap_device(entry);
> > +     if (!si)
> > +             return NULL;
> > +
> >       for (;;) {
> >               int err;
> >
> > @@ -499,6 +503,7 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >       put_swap_folio(new_folio, entry);
> >       folio_unlock(new_folio);
> >  put_and_return:
> > +     put_swap_device(si);
> >       if (!(*new_page_allocated) && new_folio)
> >               folio_put(new_folio);
> >       return result;
> > @@ -518,16 +523,11 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >               struct vm_area_struct *vma, unsigned long addr,
> >               struct swap_iocb **plug)
> >  {
> > -     struct swap_info_struct *si;
> >       bool page_allocated;
> >       struct mempolicy *mpol;
> >       pgoff_t ilx;
> >       struct folio *folio;
> >
> > -     si = get_swap_device(entry);
> > -     if (!si)
> > -             return NULL;
> > -
> >       mpol = get_vma_policy(vma, addr, 0, &ilx);
> >       folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
> >                                       &page_allocated, false);
> > @@ -535,8 +535,6 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >
> >       if (page_allocated)
> >               swap_read_folio(folio, plug);
> > -
> > -     put_swap_device(si);
> >       return folio;
> >  }
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 5d0f8b13a958..aefe71fd160c 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1005,18 +1005,12 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> >       struct folio *folio;
> >       struct mempolicy *mpol;
> >       bool folio_was_allocated;
> > -     struct swap_info_struct *si;
> >       int ret = 0;
> >
> >       /* try to allocate swap cache folio */
> > -     si = get_swap_device(swpentry);
> > -     if (!si)
> > -             return -EEXIST;
> > -
> >       mpol = get_task_policy(current);
> >       folio = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
> > -                     NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
> > -     put_swap_device(si);
> > +                             NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
> >       if (!folio)
> >               return -ENOMEM;
> >
> >
> > ---
> > base-commit: 02dafa01ec9a00c3758c1c6478d82fe601f5f1ba
> > change-id: 20251109-revert-78524b05f1a3-04a1295bef8a
> >
> > Best regards,
> > --
> > Kairui Song <kasong@...cent.com>
> >
> >
> >
>
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>

Thanks for the info, and sorry about that. I was trying out new tools
to send patches, so the Cc tags were missing; I will fix that. This
patch is meant to be merged into mainline first.
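
For anyone following the reasoning in the commit message, here is a
minimal userspace sketch of the pinning pattern the revert restores:
the read helper pins whichever device the entry belongs to, instead of
relying on the caller's pin on a possibly different device. Everything
below is hypothetical and for illustration only: struct model_swap_dev
and the model_* functions are made-up names, the model is
single-threaded, and the real get_swap_device()/put_swap_device() are
more involved than a plain counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a swap device; not the kernel's swap_info_struct. */
struct model_swap_dev {
	bool online;	/* cleared when the modelled swapoff starts */
	int users;	/* number of active pins */
	bool freed;	/* set once teardown has completed */
};

/* Models get_swap_device(): pin the device, or fail if it is going away. */
static struct model_swap_dev *model_get_dev(struct model_swap_dev *dev)
{
	if (!dev || !dev->online)
		return NULL;
	dev->users++;
	return dev;
}

/* Models put_swap_device(): drop a pin; the last pin lets teardown finish. */
static void model_put_dev(struct model_swap_dev *dev)
{
	dev->users--;
	if (!dev->online && dev->users == 0)
		dev->freed = true;
}

/* Models swapoff: refuse new pins, tear down once nobody holds one. */
static void model_swapoff(struct model_swap_dev *dev)
{
	dev->online = false;
	if (dev->users == 0)
		dev->freed = true;
}

/*
 * Models the read helper after the revert: it takes its own pin on the
 * device the entry belongs to, so a caller holding a pin on a different
 * device (the VMA readahead case) cannot make it touch a freed device.
 * Returns 0 on success, -1 if the device could not be pinned.
 */
static int model_read_entry(struct model_swap_dev *dev)
{
	struct model_swap_dev *si = model_get_dev(dev);

	if (!si)
		return -1;
	/* While pinned, teardown cannot complete, so si is safe to use. */
	model_put_dev(si);
	return 0;
}
```

In this model, a read against a device that has already been swapped
off fails cleanly with -1 rather than touching freed state, and a
swapoff that races with an in-flight pin only completes once that pin
is dropped.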
