Date:   Wed, 21 Jun 2023 02:26:03 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Domenico Cerasuolo <cerasuolodomenico@...il.com>
Cc:     Hyeonggon Yoo <42.hyeyoo@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        Seth Jennings <sjenning@...hat.com>,
        Dan Streetman <ddstreet@...e.org>,
        Vitaly Wool <vitaly.wool@...sulko.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Nhat Pham <nphamcs@...il.com>, Yu Zhao <yuzhao@...gle.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG mm-unstable] "kernel BUG at mm/swap.c:393!" on commit b9c91c43412f2e

On Wed, Jun 21, 2023 at 2:19 AM Domenico Cerasuolo
<cerasuolodomenico@...il.com> wrote:
>
> On Wed, Jun 21, 2023 at 10:06 AM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> > On Wed, Jun 21, 2023 at 12:01 AM Hyeonggon Yoo <42.hyeyoo@...il.com> wrote:
> > >
> > > On Wed, Jun 07, 2023 at 07:51:43PM +0000, Yosry Ahmed wrote:
> > > > Commit 71024cb4a0bf ("frontswap: remove frontswap_tmem_exclusive_gets")
> > > > removed support for exclusive loads from frontswap as it was not used.
> > > > Bring back exclusive loads support to frontswap by adding an "exclusive"
> > > > output parameter to frontswap_ops->load.
> > > >
> > > > On the zswap side, add a module parameter to enable/disable exclusive
> > > > loads, and a config option to control the boot default value.
> > > > Refactor zswap entry invalidation in zswap_frontswap_invalidate_page()
> > > > into zswap_invalidate_entry() to reuse it in zswap_frontswap_load() if
> > > > exclusive loads are enabled.
> > > >
> > > > With exclusive loads, we avoid having two copies of the same page in
> > > > memory (compressed & uncompressed) after faulting it in from zswap. On
> > > > the other hand, if the page is to be reclaimed again without being
> > > > dirtied, it will be re-compressed. Compression is not usually slow, and
> > > > a page that was just faulted in is less likely to be reclaimed again
> > > > soon.
> > > >
> > > > Suggested-by: Yu Zhao <yuzhao@...gle.com>
> > > > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
> > > > ---
> > > >
> > > > v1 -> v2:
> > > > - Add a module parameter to control whether exclusive loads are enabled
> > > >   or not, the config option now controls the default boot value instead.
> > > >   Replaced frontswap_ops->exclusive_loads by an output parameter to
> > > >   frontswap_ops->load() (Johannes Weiner).
> > > > ---
> > >
> > > Hi Yosry, I was testing the latest mm-unstable and encountered a bug.
> > > It was bisectable and this is the first bad commit.
> > >
> > >
> > > Attached config file and bisect log.
> > > The oops message is available at:
> > >
> > > https://social.kernel.org/media/eace06d71655b3cc76411366573e4a8ce240ad65b8fd20977d7c73eec9dc2253.jpg
> > >
> > > (the head commit is b9c91c43412f2e07 "mm: zswap: support exclusive loads")
> > > (it's an image because I tested it on a real machine)
> > >
> > >
> > > This is what I have as swap space:
> > >
> > > $ cat /proc/swaps
> > > Filename                                Type            Size            Used            Priority
> > > /var/swap                               file            134217724       0               -2
> > > /dev/zram0                              partition       8388604         0               100
> >
> >
> > Hi Hyeonggon,
> >
> > Thanks for reporting this! I think I know what went wrong. Could you
> > please verify, if possible, whether the fix below works?
> >
> > Domenico, I believe the below fix would also fix a problem with the
> > recent writeback series. If the entry is invalidated before we grab the
> > lock to put the local ref in zswap_frontswap_load(), then the entry
> > will be freed once we call zswap_entry_put(), and the movement to the
> > beginning of the LRU will be operating on a freed entry. It also modifies
> > your recently added commit 418fd29d9de5 ("mm: zswap: invalidate entry
> > after writeback"). I would appreciate it if you also take a look.
>
> Hi Yosry,
>
> Thanks, this makes sense indeed. I've been running a stress test too for
> an hour now and it seems fine.

Thanks! I will send the patch to Andrew then!

>
> >
> > If this works as intended, I can send a formal patch (applies on top
> > of fd247f029cd0 ("mm/gup: do not return 0 from pin_user_pages_fast()
> > for bad args")):
> >
> > From 4b7f949b3ffb42d969d525d5b576fad474f55276 Mon Sep 17 00:00:00 2001
> > From: Yosry Ahmed <yosryahmed@...gle.com>
> > Date: Wed, 21 Jun 2023 07:43:51 +0000
> > Subject: [PATCH] mm: zswap: fix double invalidate with exclusive loads
> >
> > If exclusive loads are enabled for zswap, we invalidate the entry before
> > returning from zswap_frontswap_load(), after dropping the local
> > reference. However, the tree lock is dropped during decompression after
> > the local reference is acquired, so the entry could be invalidated
> > before we drop the local ref. If this happens, the entry is freed once
> > we drop the local ref, and zswap_invalidate_entry() tries to invalidate
> > an already freed entry.
> >
> > Fix this by:
> > (a) Making sure zswap_invalidate_entry() is always called with a local
> >     ref held, to avoid being called on a freed entry.
> > (b) Making sure zswap_invalidate_entry() only drops the ref if the entry
> >     was actually on the rbtree. Otherwise, another invalidation could
> >     have already happened, and the initial ref is already dropped.
> >
> > With these changes, there is no need to make sure the entry still
> > exists in the tree in zswap_reclaim_entry() before invalidating it, as
> > zswap_invalidate_entry() will make this check internally.
> >
> > Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> > Reported-by: Hyeonggon Yoo <42.hyeyoo@...il.com>
> > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
> > ---
> >  mm/zswap.c | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 87b204233115..62195f72bf56 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -355,12 +355,14 @@ static int zswap_rb_insert(struct rb_root *root, struct zswap_entry *entry,
> >         return 0;
> >  }
> >
> > -static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
> > +static bool zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
> >  {
> >         if (!RB_EMPTY_NODE(&entry->rbnode)) {
> >                 rb_erase(&entry->rbnode, root);
> >                 RB_CLEAR_NODE(&entry->rbnode);
> > +               return true;
> >         }
> > +       return false;
> >  }
> >
> >  /*
> > @@ -599,14 +601,16 @@ static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
> >         return NULL;
> >  }
> >
> > +/*
> > + * If the entry is still valid in the tree, drop the initial ref and remove it
> > + * from the tree. This function must be called with an additional ref held,
> > + * otherwise it may race with another invalidation freeing the entry.
> > + */
> >  static void zswap_invalidate_entry(struct zswap_tree *tree,
> >                                    struct zswap_entry *entry)
> >  {
> > -       /* remove from rbtree */
> > -       zswap_rb_erase(&tree->rbroot, entry);
> > -
> > -       /* drop the initial reference from entry creation */
> > -       zswap_entry_put(tree, entry);
> > +       if (zswap_rb_erase(&tree->rbroot, entry))
> > +               zswap_entry_put(tree, entry);
> >  }
> >
> >  static int zswap_reclaim_entry(struct zswap_pool *pool)
> > @@ -659,8 +663,7 @@ static int zswap_reclaim_entry(struct zswap_pool *pool)
> >          * swapcache. Drop the entry from zswap - unless invalidate already
> >          * took it out while we had the tree->lock released for IO.
> >          */
> > -       if (entry == zswap_rb_search(&tree->rbroot, swpoffset))
> > -               zswap_invalidate_entry(tree, entry);
> > +       zswap_invalidate_entry(tree, entry);
> >
> >  put_unlock:
> >         /* Drop local reference */
> > @@ -1466,7 +1469,6 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> >                 count_objcg_event(entry->objcg, ZSWPIN);
> >  freeentry:
> >         spin_lock(&tree->lock);
> > -       zswap_entry_put(tree, entry);
> >         if (!ret && zswap_exclusive_loads_enabled) {
> >                 zswap_invalidate_entry(tree, entry);
> >                 *exclusive = true;
> > @@ -1475,6 +1477,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> >                 list_move(&entry->lru, &entry->pool->lru);
> >                 spin_unlock(&entry->pool->lru_lock);
> >         }
> > +       zswap_entry_put(tree, entry);
> >         spin_unlock(&tree->lock);
> >
> >         return ret;
> > --
> > 2.41.0.162.gfafddb0af9-goog
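
For readers following the refcount reasoning in the patch above, here is a
minimal user-space sketch of the interleaving it describes. It is not kernel
code: struct model_entry and every model_* function are hypothetical
stand-ins, the on_tree flag stands in for !RB_EMPTY_NODE(&entry->rbnode),
and locking is omitted because the steps are replayed sequentially in the
racy order. The model only counts how often a reference is dropped on an
entry that was already freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct zswap_entry. */
struct model_entry {
	int refcount;   /* initial ref from creation plus any local refs */
	bool on_tree;   /* models !RB_EMPTY_NODE(&entry->rbnode) */
	bool freed;     /* set once the last reference is dropped */
	int bad_puts;   /* puts issued after the entry was freed */
};

static void model_entry_init(struct model_entry *e)
{
	e->refcount = 1;        /* initial reference from entry creation */
	e->on_tree = true;
	e->freed = false;
	e->bad_puts = 0;
}

static void model_entry_get(struct model_entry *e)
{
	assert(!e->freed && e->refcount > 0);
	e->refcount++;
}

static void model_entry_put(struct model_entry *e)
{
	if (e->freed) {         /* would be a use-after-free in the kernel */
		e->bad_puts++;
		return;
	}
	if (--e->refcount == 0)
		e->freed = true;
}

/* Like the patched zswap_rb_erase(): report whether anything was erased. */
static bool model_rb_erase(struct model_entry *e)
{
	if (e->on_tree) {
		e->on_tree = false;
		return true;
	}
	return false;
}

/* Pre-patch behaviour: always drop the initial ref. */
static void model_invalidate_old(struct model_entry *e)
{
	model_rb_erase(e);
	model_entry_put(e);
}

/* Patched behaviour: drop the initial ref only if the entry was on the tree. */
static void model_invalidate_new(struct model_entry *e)
{
	if (model_rb_erase(e))
		model_entry_put(e);
}

/*
 * Replay an exclusive load racing with another invalidation.
 * Returns the number of puts that hit an already freed entry.
 */
static int model_exclusive_load(bool fixed)
{
	struct model_entry e;

	model_entry_init(&e);
	model_entry_get(&e);    /* load path takes its local ref */

	/* Tree lock dropped for decompression: another invalidation runs. */
	if (fixed)
		model_invalidate_new(&e);
	else
		model_invalidate_old(&e);

	/* Lock reacquired at the freeentry label. */
	if (fixed) {
		model_invalidate_new(&e);  /* no-op, entry already off the tree */
		model_entry_put(&e);       /* local ref dropped last, frees entry */
	} else {
		model_entry_put(&e);       /* local ref dropped first, frees entry */
		model_invalidate_old(&e);  /* operates on a freed entry */
	}
	return e.bad_puts;
}
```

Calling model_exclusive_load(false) replays the pre-patch ordering and
reports one put on a freed entry, while model_exclusive_load(true) replays
the patched ordering and reports none. That matches points (a) and (b) of
the commit message: the local ref is held across the invalidation, and the
initial ref is dropped only by whoever actually removed the entry from the
tree.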
