[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7-qr6ar0HPyiYOv@google.com>
Date: Wed, 26 Feb 2025 23:58:39 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Nhat Pham <nphamcs@...il.com>
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, chengming.zhou@...ux.dev,
linux-mm@...ck.org, kernel-team@...a.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] zswap: do not crash the kernel on decompression failure
On Wed, Feb 26, 2025 at 03:29:06PM -0800, Nhat Pham wrote:
> On Tue, Feb 25, 2025 at 7:12 PM Yosry Ahmed <yosry.ahmed@...ux.dev> wrote:
> >
> > On Tue, Feb 25, 2025 at 01:32:00PM -0800, Nhat Pham wrote:
> > > Currently, we crash the kernel when a decompression failure occurs in
> > > zswap (either because of memory corruption, or a bug in the compression
> > > algorithm). This is overkill. We should only SIGBUS the unfortunate
> > > process asking for the zswap entry on zswap load, and skip the corrupted
> > > entry in zswap writeback.
> >
> > Some relevant observations/questions, but not really actionable for this
> > patch, perhaps some future work, or more likely some incoherent
> > illogical thoughts :
> >
> > (1) It seems like not making the folio uptodate will cause shmem faults
> > to mark the swap entry as hwpoisoned, but I don't see similar handling
> > for do_swap_page(). So it seems like even if we SIGBUS the process,
> > other processes mapping the same page could follow in the same
> > footsteps.
>
> poisoned, I think? It's the weird SWP_PTE_MARKER thing.
Not sure what you mean here, I am referring to the inconsistency between
shmem_swapin_folio() and do_swap_page().
>
> [...]
>
> >
> >
> > (3) If we run into a decompression failure, should we free the
> > underlying memory from zsmalloc? I don't know. On one hand if we free it
> > zsmalloc may start using it for more compressed objects. OTOH, I don't
> > think proper hwpoison handling will kick in until the page is freed.
> > Maybe we should tell zsmalloc to drop this page entirely and mark
> > objects within it as invalid? Probably not worth the hassle but
> > something to think about.
>
> This might be a fun follow up :) I guess my question is - is there a
> chance that we might recover in the future?
>
> For example, can memory (hardware) failure somehow recover, or the
> decompression algorithm somehow fix itself? I suppose not?
>
> If that is the case, one thing we can do is just free the zsmalloc
> slot, then mark the zswap entry as corrupted somehow. We can even
> invalidate the zswap entry altogether, and install a (shared)
> ZSWAP_CORRUPT_ENTRY. Future readers can check for this and exit if
> they encounter a corrupted entry?
>
> It's not common enough (lol hopefully) for me to optimize right away,
> but I can get on with it if there are actual data of this happening
> IRL/in product :)ion
I am not aware of this being a common problem, but something to keep in
mind, perhaps.
Powered by blists - more mailing lists