Message-ID: <CAJD7tkbO938ETJn0FOG_vVU4V2_dBanio1QG56cp6ctFHpSeNw@mail.gmail.com>
Date: Fri, 23 Aug 2024 09:07:44 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Nhat Pham <nphamcs@...il.com>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>, Piotr Oniszczuk <piotr.oniszczuk@...il.com>, 
	LKML <linux-kernel@...r.kernel.org>, Johannes Weiner <hannes@...xchg.org>, 
	Linux-MM <linux-mm@...ck.org>
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at
 mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")

On Fri, Aug 23, 2024 at 7:47 AM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Fri, Aug 23, 2024 at 10:35:19AM -0400, Nhat Pham wrote:
> > On Fri, Aug 23, 2024 at 9:13 AM Matthew Wilcox <willy@...radead.org> wrote:
> > >
> > >
> > > That said, zswap could handle this better.  There's no need to panic the
> > > entire machine over being unable to read a page from swap.  Killing just
> > > the process that needed this page is sufficient.
> >
> > Agree 100%. It is silly to kill the entire host for a swap read error,
> > and extra silly to kill the process because we fail to write back - for
> > all we know, that page might never be needed by the process again!!!
> >
> > >
> > > Suggested patch at end after the oops.
> > >
> > > @@ -1601,6 +1613,7 @@ bool zswap_load(struct folio *folio)
> > >         bool swapcache = folio_test_swapcache(folio);
> > >         struct xarray *tree = swap_zswap_tree(swp);
> > >         struct zswap_entry *entry;
> > > +       int err;
> > >
> > >         VM_WARN_ON_ONCE(!folio_test_locked(folio));
> > >
> > > @@ -1638,10 +1651,13 @@ bool zswap_load(struct folio *folio)
> > >         if (!entry)
> > >                 return false;
> > >
> > > -       if (entry->length)
> > > -               zswap_decompress(entry, folio);
> > > -       else
> > > +       if (entry->length) {
> > > +               err = zswap_decompress(entry, folio);
> > > +               if (err)
> > > +                       return false;
> >
> > Here, if zswap decompression fails and zswap_load() returns false, the
> > page_io logic will proceed as if zswap does not have the page and
> > read garbage from the backing device instead. This could potentially
> > lead to silent data/memory corruption, right? Or am I missing
> > something :) Maybe we could be extra careful here and treat it as if
> > there were a bio read error in the case where zswap owns the page but
> > cannot decompress it?
>
> Ah; you know more about how zswap works than I do.  So it's not a
> write-through cache?  I guess we need to go a bit further, then, and
> return an errno from zswap_load() -- EIO/ENOENT/0 -- and handle that
> appropriately.
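
For reference, a rough sketch of what that three-way return could look
like at the call site (the error values and the surrounding structure
are just an illustration, not code from this thread or from mainline):

	int ret = zswap_load(folio);	/* 0, -ENOENT or -EIO */

	if (!ret) {
		/* Found in zswap and decompressed; folio is uptodate. */
		folio_unlock(folio);
	} else if (ret == -ENOENT) {
		/* Not in zswap: fall through to the normal bio read. */
	} else {
		/*
		 * -EIO: zswap owns the data but cannot recover it. Do
		 * not touch the backing device (it holds stale data);
		 * leave the folio !uptodate and unlock it so the fault
		 * is reported as an I/O error.
		 */
		folio_unlock(folio);
	}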

Alternatively, it should work if we just return true without calling
folio_mark_uptodate(); this is what we already do if we get a large
folio in zswap_load(). Returning true means that the page was found in
zswap, so we won't fall back to reading from the backing device. Not
marking the folio uptodate will then cause an IO error, IIUC.
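
Roughly, the failure branch in zswap_load() would then look something
like the below (just a sketch on top of the hunk quoted above; it
assumes zswap_decompress() is changed to return an errno, and it skips
the unchanged accounting/teardown around it):

	if (entry->length) {
		err = zswap_decompress(entry, folio);
		if (err) {
			/*
			 * zswap owns this page but cannot recover it.
			 * Returning true keeps the caller from falling
			 * back to stale data on the backing device,
			 * and skipping folio_mark_uptodate() below
			 * turns the fault into an I/O error for the
			 * process that needed the page.
			 */
			return true;
		}
	} else {
		/* same-filled entry: copy entry->value as before */
	}

	/* ZSWPIN accounting, entry teardown, etc. unchanged */
	folio_mark_uptodate(folio);
	return true;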

>
> > The rest seems solid to me :)
