linux-kernel - Re: [PATCH] mm: slub: Panic if the object corruption is checked.


Message-ID: <20250121003959.GA610565@tiffany>
Date: Tue, 21 Jan 2025 09:40:01 +0900
From: Hyesoo Yu <hyesoo.yu@...sung.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: janghyuck.kim@...sung.com, Andrew Morton <akpm@...ux-foundation.org>,
	Jonathan Corbet <corbet@....net>, Christoph Lameter <cl@...ux.com>, Pekka
	Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>, Joonsoo
	Kim <iamjoonsoo.kim@....com>, Vlastimil Babka <vbabka@...e.cz>, Roman
	Gushchin <roman.gushchin@...ux.dev>, Hyeonggon Yoo <42.hyeyoo@...il.com>,
	linux-mm@...ck.org, linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: slub: Panic if the object corruption is checked.

On Mon, Jan 20, 2025 at 03:36:08PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 20, 2025 at 05:28:21PM +0900, Hyesoo Yu wrote:
> > If a slab object is corrupted or an error occurs in its internal
> > value, continuing after restoration may cause other side effects.
> > At this point, it is difficult to debug because the problem occurred
> > in the past. A flag has been added that can cause a panic when there
> > is a problem with the object.
> > 
> > Signed-off-by: Hyesoo Yu <hyesoo.yu@...sung.com>
> > Change-Id: I4e7e5e0ec3421a7f6c84d591db052f79d3775493
> 
> Linux does not use Change IDs.  Please omit these from future patches.
> 
> Panicing is a very unfriendly approach.  I think a better approach would
> be to freeze the slab where corruption is detected.  That is, no future
> objects are allocated from that slab, and attempts to free objects from
> that slab become no-ops.  I don't think that should be hard to implement.
>

Thank you for your response. That was my mistake. I will remove the Change-Id.

I agree that freezing is better than recovery or panic for the system's stability.
However, what I want from this patch is not only to keep the system running stably.
I need to trigger a panic immediately so that I can investigate the SLUB state.

I would like to analyze the corrupted data at the moment it is detected, to check
for causes such as cache problems, user errors, or system clock frequency issues,
rather than letting the system carry on as if nothing had happened.

However, I agree that a panic is not a friendly approach.
I will modify the patch to report the problem using WARN() and then rely on
panic_on_warn to trigger the panic.
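
Roughly, the intended flow could look like the user-space model below. This is
only a sketch, not kernel code and not the actual patch: every model_* name is
hypothetical, the poison check is a simplified stand-in for the SLUB debug
checks, and abort() stands in for a kernel panic.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
#define MODEL_POISON_BYTE 0x6b

static bool model_panic_on_warn;  /* models the panic_on_warn setting */
static int model_warn_count;      /* number of warnings reported so far */

/* Models a warning: always report, stop only if panic_on_warn is set. */
static void model_warn(const char *msg)
{
	model_warn_count++;
	fprintf(stderr, "WARNING: %s\n", msg);
	if (model_panic_on_warn) {
		fprintf(stderr, "panic: panic_on_warn set\n");
		abort();	/* stand-in for a kernel panic */
	}
}

/* Returns true if a freed object still holds only poison bytes. */
static bool model_check_poison(const unsigned char *obj, size_t size)
{
	for (size_t i = 0; i < size; i++) {
		if (obj[i] != MODEL_POISON_BYTE) {
			model_warn("poison overwritten in freed object");
			return false;
		}
	}
	return true;
}
```

With model_panic_on_warn left false, a corrupted object is reported and the
caller continues; with it set to true, the first report stops execution at the
point of detection, which is the state I want to capture.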

Thanks,
Regards.