Message-ID: <20250121005452.GB610565@tiffany>
Date: Tue, 21 Jan 2025 09:54:52 +0900
From: Hyesoo Yu <hyesoo.yu@...sung.com>
To: Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc: janghyuck.kim@...sung.com, Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>, Christoph Lameter <cl@...ux.com>, Pekka
Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>, Joonsoo
Kim <iamjoonsoo.kim@....com>, Vlastimil Babka <vbabka@...e.cz>, Roman
Gushchin <roman.gushchin@...ux.dev>, linux-mm@...ck.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: slub: Panic if the object corruption is checked.
On Tue, Jan 21, 2025 at 12:41:01AM +0900, Hyeonggon Yoo wrote:
> On Mon, Jan 20, 2025 at 5:30 PM Hyesoo Yu <hyesoo.yu@...sung.com> wrote:
> >
> > If a slab object is corrupted or an error occurs in its internal
> > value, continuing after restoration may cause other side effects.
> > At this point, it is difficult to debug because the problem occurred
> > in the past. A flag has been added that can cause a panic when there
> > is a problem with the object.
>
> Hi Hyesoo,
>
> I'm concerned about this because it goes against the effort to avoid
> introducing new BUG() calls [1].
>
> And I think it would be more appropriate to use existing panic_on_warn
> functionality [2] which causes
> a panic on WARN(), rather than introducing a SLUB-specific knob to do
> the same thing.
>
> However, SLUB does not call WARN(); it uses pr_err() instead when
> reporting an error.
> Vlastimil and I talked about changing it to use WARN() a while ago
> [3], but neither of us has done that yet.
>
> You may want to look at it, since it also aligns with your purpose.
> FYI, if you would like to work on it, please make sure that WARN()
> is suppressed during kunit tests.
>
> [1] https://docs.kernel.org/process/deprecated.html#bug-and-bug-on
> [2] https://www.kernel.org/doc/html/v6.9/admin-guide/sysctl/kernel.html#panic-on-warn
> [3] https://lore.kernel.org/linux-mm/d4219cd9-32d3-4697-93b9-6a44bf77d50c@suse.cz
>
> Best,
> Hyeonggon
Thanks for the response.
Using WARN() instead of panicking is a great idea.
Thanks for pointing out what I missed.
The next version of the patch will use WARN().
Thanks.
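
For illustration only, here is a rough userspace model of the behaviour
discussed above: a corruption report is skipped during a kunit test,
becomes a warning otherwise, and turns into a panic only when
panic_on_warn is set. This is not the planned patch. report_corruption(),
struct report_policy and the enum values are hypothetical names made up
for this sketch; only panic_on_warn and the kunit check correspond to
things mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum report_action {
	REPORT_SUPPRESSED,	/* nothing reported (kunit test running) */
	REPORT_WARNED,		/* warning printed, execution continues */
	REPORT_PANIC,		/* warning printed, caller should panic */
};

struct report_policy {
	bool panic_on_warn;	/* models the kernel.panic_on_warn sysctl */
	bool in_kunit_test;	/* models slab_in_kunit_test() being true */
};

/*
 * Decide what a corruption report does under the given policy.
 * The real kernel would panic inside WARN() when panic_on_warn is set;
 * here the decision is returned so that it can be inspected.
 */
static enum report_action
report_corruption(const struct report_policy *p, const char *what)
{
	assert(p != NULL);

	/* Tests corrupt objects on purpose, so stay silent there. */
	if (p->in_kunit_test)
		return REPORT_SUPPRESSED;

	fprintf(stderr, "WARNING: slab corruption detected: %s\n", what);

	/* One global knob decides whether a warning is fatal. */
	return p->panic_on_warn ? REPORT_PANIC : REPORT_WARNED;
}
```

In this model, no SLUB-specific flag is needed: the same report either
continues or panics depending on the global setting alone, which is the
point made in the review.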
>
> > Signed-off-by: Hyesoo Yu <hyesoo.yu@...sung.com>
> > Change-Id: I4e7e5e0ec3421a7f6c84d591db052f79d3775493
> > ---
> > Documentation/mm/slub.rst | 2 ++
> > include/linux/slab.h | 4 ++++
> > mm/slub.c | 14 ++++++++++++++
> > 3 files changed, 20 insertions(+)
> >
> > diff --git a/Documentation/mm/slub.rst b/Documentation/mm/slub.rst
> > index 84ca1dc94e5e..ce58525db93d 100644
> > --- a/Documentation/mm/slub.rst
> > +++ b/Documentation/mm/slub.rst
> > @@ -53,6 +53,7 @@ Possible debug options are::
> > U User tracking (free and alloc)
> > T Trace (please only use on single slabs)
> > A Enable failslab filter mark for the cache
> > + C Panic if object corruption is checked.
> > O Switch debugging off for caches that would have
> > caused higher minimum slab orders
> > - Switch all debugging off (useful if the kernel is
> > @@ -113,6 +114,7 @@ options from the ``slab_debug`` parameter translate to the following files::
> > U store_user
> > T trace
> > A failslab
> > + C corruption_panic
> >
> > failslab file is writable, so writing 1 or 0 will enable or disable
> > the option at runtime. Write returns -EINVAL if cache is an alias.
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 10a971c2bde3..4391c30564d6 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -31,6 +31,7 @@ enum _slab_flag_bits {
> > _SLAB_CACHE_DMA32,
> > _SLAB_STORE_USER,
> > _SLAB_PANIC,
> > + _SLAB_CORRUPTION_PANIC,
> > _SLAB_TYPESAFE_BY_RCU,
> > _SLAB_TRACE,
> > #ifdef CONFIG_DEBUG_OBJECTS
> > @@ -97,6 +98,9 @@ enum _slab_flag_bits {
> > #define SLAB_STORE_USER __SLAB_FLAG_BIT(_SLAB_STORE_USER)
> > /* Panic if kmem_cache_create() fails */
> > #define SLAB_PANIC __SLAB_FLAG_BIT(_SLAB_PANIC)
> > +/* Panic if object corruption is checked */
> > +#define SLAB_CORRUPTION_PANIC __SLAB_FLAG_BIT(_SLAB_CORRUPTION_PANIC)
> > +
> > /**
> > * define SLAB_TYPESAFE_BY_RCU - **WARNING** READ THIS!
> > *
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 48cefc969480..36a8dabf1349 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1306,6 +1306,8 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab)
> > fault, end - 1, fault - start);
> > print_section(KERN_ERR, "Padding ", pad, remainder);
> >
> > + BUG_ON(s->flags & SLAB_CORRUPTION_PANIC);
> > +
> > restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
> > }
> >
> > @@ -1389,6 +1391,8 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> > if (!ret && !slab_in_kunit_test()) {
> > print_trailer(s, slab, object);
> > add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> > +
> > + BUG_ON(s->flags & SLAB_CORRUPTION_PANIC);
> > }
> >
> > return ret;
> > @@ -1689,6 +1693,9 @@ parse_slub_debug_flags(char *str, slab_flags_t *flags, char **slabs, bool init)
> > case 'a':
> > *flags |= SLAB_FAILSLAB;
> > break;
> > + case 'c':
> > + *flags |= SLAB_CORRUPTION_PANIC;
> > + break;
> > case 'o':
> > /*
> > * Avoid enabling debugging on caches if its minimum
> > @@ -6874,6 +6881,12 @@ static ssize_t store_user_show(struct kmem_cache *s, char *buf)
> >
> > SLAB_ATTR_RO(store_user);
> >
> > +static ssize_t corruption_panic_show(struct kmem_cache *s, char *buf)
> > +{
> > + return sysfs_emit(buf, "%d\n", !!(s->flags & SLAB_CORRUPTION_PANIC));
> > +}
> > +SLAB_ATTR_RO(corruption_panic);
> > +
> > static ssize_t validate_show(struct kmem_cache *s, char *buf)
> > {
> > return 0;
> > @@ -7092,6 +7105,7 @@ static struct attribute *slab_attrs[] = {
> > &red_zone_attr.attr,
> > &poison_attr.attr,
> > &store_user_attr.attr,
> > + &corruption_panic_attr.attr,
> > &validate_attr.attr,
> > #endif
> > #ifdef CONFIG_ZONE_DMA
> > --
> > 2.48.0
> >
>