Message-ID: <CAJD7tkYJ9KNQkpPxWQpGj0SnofSq5-mLzDnChCqTtJPGrhzY-A@mail.gmail.com>
Date: Thu, 29 Aug 2024 15:53:32 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Piotr Oniszczuk <piotr.oniszczuk@...il.com>, Pedro Falcato <pedro.falcato@...il.com>,
Nhat Pham <nphamcs@...il.com>,
Linux regressions mailing list <regressions@...ts.linux.dev>, LKML <linux-kernel@...r.kernel.org>,
Johannes Weiner <hannes@...xchg.org>, Linux-MM <linux-mm@...ck.org>
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at
mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
On Thu, Aug 29, 2024 at 3:29 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Thu, Aug 29, 2024 at 02:54:25PM -0700, Yosry Ahmed wrote:
> > Looking at the zswap commits between 6.8 and 6.9, ignoring cleanups
> > and seemingly irrelevant patches (e.g. swapoff fixups), I think some
> > likely candidates could be the following, but this is not really
> > based on any scientific methodology:
> >
> > 44c7c734a5132 mm/zswap: split zswap rb-tree
> > c2e2ba770200b mm/zswap: only support zswap_exclusive_loads_enabled
> > a230c20e63efe mm/zswap: zswap entry doesn't need refcount anymore
> > 8409a385a6b41 mm/zswap: improve with alloc_workqueue() call
> > 0827a1fb143fa mm/zswap: invalidate zswap entry when swap entry free
> >
> > I also noticed that you are using z3fold as the zpool. Is the problem
> > reproducible with zsmalloc? I wouldn't be surprised if there's a
> > z3fold bug somewhere.
>
> You're assuming that it's a zswap/zsmalloc/... bug. If it's a random
> scribble, as suggested by Takero Funaki:
>
> https://lore.kernel.org/linux-mm/CAPpoddere2g=kkMzrxuJ1KCG=0Hg1-1v=ppg4dON9wK=pKq2uQ@mail.gmail.com/
>
> then focusing on zswap will not be fruitful.
IIUC that was for the initial bug report. Piotr reported a different
problem on v6.9 in the same thread: a soft lockup. The two look
unrelated to me. Also, the patch that Takero found through bisection
landed in v6.10, so it cannot be the cause of the soft lockups.
Piotr never confirmed whether reverting the patch Takero found fixes
the initial problem on v6.10 though; that would be useful to know.
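
(Re the zsmalloc suggestion above: for anyone trying to reproduce this,
a rough sketch of switching the zpool backend, assuming zsmalloc is
built in (CONFIG_ZSMALLOC), would be something like:

    # boot with zswap.zpool=zsmalloc on the kernel command line,
    # or switch at runtime via the module parameter:
    echo zsmalloc > /sys/module/zswap/parameters/zpool
    cat /sys/module/zswap/parameters/zpool

New stores should then go to the zsmalloc pool, while entries already
sitting in the z3fold pool stay there until they are loaded back or
invalidated, so a fresh boot with the parameter set is the cleaner
test.)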