Message-ID: <20250121003959.GA610565@tiffany>
Date: Tue, 21 Jan 2025 09:40:01 +0900
From: Hyesoo Yu <hyesoo.yu@...sung.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: janghyuck.kim@...sung.com, Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>, Christoph Lameter <cl@...ux.com>, Pekka
Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>, Joonsoo
Kim <iamjoonsoo.kim@....com>, Vlastimil Babka <vbabka@...e.cz>, Roman
Gushchin <roman.gushchin@...ux.dev>, Hyeonggon Yoo <42.hyeyoo@...il.com>,
linux-mm@...ck.org, linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: slub: Panic if the object corruption is checked.
On Mon, Jan 20, 2025 at 03:36:08PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 20, 2025 at 05:28:21PM +0900, Hyesoo Yu wrote:
> > If a slab object is corrupted or an error occurs in its internal
> > value, continuing after restoration may cause other side effects.
> > At this point, it is difficult to debug because the problem occurred
> > in the past. A flag has been added that can cause a panic when there
> > is a problem with the object.
> >
> > Signed-off-by: Hyesoo Yu <hyesoo.yu@...sung.com>
> > Change-Id: I4e7e5e0ec3421a7f6c84d591db052f79d3775493
>
> Linux does not use Change IDs. Please omit these from future patches.
>
> Panicing is a very unfriendly approach. I think a better approach would
> be to freeze the slab where corruption is detected. That is, no future
> objects are allocated from that slab, and attempts to free objects from
> that slab become no-ops. I don't think that should be hard to implement.
>
Thank you for your response. That was my mistake; I will remove the Change-Id.
I agree that freezing is better than recovery or panic for the system's stability.
However, what I want from this patch is not only to keep the system running stably.
I need to trigger a panic immediately so that I can investigate the slub.
I would like to analyze the corrupted data at that moment to check for issues
such as cache problems, user errors, system clock frequency, and similar problems,
rather than letting the system carry on as if nothing had happened.
That said, I agree that panicking is not a friendly approach.
I will modify the patch to report the problem using warn() and then rely on
panic_on_warn to trigger the panic.
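
The warn-then-panic flow can be modelled roughly in userspace as below. This is only a sketch, not the actual patch and not kernel code: model_warn(), model_panic(), check_object_model(), panic_on_warn_model and the counters are all hypothetical stand-ins. Only the idea that a warning escalates to a panic when panic_on_warn is set comes from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static bool panic_on_warn_model; /* models the panic_on_warn setting */
static int warn_count;           /* number of warnings reported */
static int panic_count;          /* number of (modelled) panics */

/* A real panic would halt the system; the model only records it. */
static void model_panic(const char *msg)
{
    panic_count++;
    fprintf(stderr, "panic: %s\n", msg);
}

/* Report a warning and escalate to a panic if the setting is enabled. */
static void model_warn(const char *msg)
{
    warn_count++;
    fprintf(stderr, "WARNING: %s\n", msg);
    if (panic_on_warn_model)
        model_panic("panic_on_warn set");
}

/*
 * Check that every byte of the object still holds the expected
 * pattern. On the first mismatch, warn and report failure instead
 * of silently restoring the bytes.
 */
static bool check_object_model(const unsigned char *obj, size_t len,
                               unsigned char pattern)
{
    assert(obj != NULL);
    for (size_t i = 0; i < len; i++) {
        if (obj[i] != pattern) {
            model_warn("slab object corrupted");
            return false;
        }
    }
    return true;
}
```

With the setting off, a corrupted object only produces a warning and the caller may continue. With it on, the same warning also ends in the modelled panic, so the state can be examined at the point of detection.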
Thanks,
Regards.