lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241007091245.8476-1-yuan.gao@ucloud.cn>
Date: Mon,  7 Oct 2024 17:12:43 +0800
From: "yuan.gao" <yuan.gao@...oud.cn>
To: 42.hyeyoo@...il.com
Cc: akpm@...ux-foundation.org,
	cl@...ux.com,
	iamjoonsoo.kim@....com,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	penberg@...nel.org,
	rientjes@...gle.com,
	roman.gushchin@...ux.dev,
	vbabka@...e.cz,
	yuan.gao@...oud.cn
Subject: Re: [PATCH] mm/slub: Avoid list corruption when removing a slab from the full list 

On 24/10/06 10:00PM, Hyeonggon Yoo wrote:
> On Sun, Oct 6, 2024 at 1:48 PM yuan.gao <yuan.gao@...oud.cn> wrote:
> >
> > Boot with slub_debug=UFPZ.
> >
> > If allocated object failed in alloc_consistency_checks, all objects of
> > the slab will be marked as used, and then the slab will be removed from
> > the partial list.
> >
> > When an object belonging to the slab got freed later, the remove_full()
> > function is called. Because the slab is neither on the partial list nor
> > on the full list, it eventually lead to a list corruption.
> 
> Good catch! Thanks for investigating the cause and fixing it.
> 
> > So we need to add the slab to full list in this case.
> 
> While I believe that behavior is not intended by alloc_debug_processing(),
> I can't think of a better fix here without adding some complexity.
> The approach looks fine to me.
> 
> > ---
> >  mm/slub.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 21f71cb6cc06..a99522b9efc0 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2746,6 +2746,8 @@ static void *alloc_single_from_partial(struct kmem_cache *s,
> >
> >         if (!alloc_debug_processing(s, slab, object, orig_size)) {
> >                 remove_partial(n, slab);
> > +               if (slab->inuse == slab->objects)
> > +                       add_full(s, n, slab);
> 
> Shouldn't this be (folio_test_slab(slab_folio(slab))) instead of
> (slab->inuse == slab->objects)?
> Oh wait. the kernel also should not call remove_partial() for non-slab folios.
> 
> So I think it should be:
> 
> if (!alloc_debug_processing(s, slab, object, orig_size)) {
>         if (folio_test_slab(slab_folio(slab))) {
>                 remove_partial(n, slab);
>                 add_full(s, n, slab);
>         }
> }

Thank you for reminding me of this. I didn't notice the subtle differences here. 
 
> By the way, SLUB always messes with struct page fields even when it is
> not a slab,
> and I think SLUB should avoid modifying those fields before confirming
> it is a slab.
> (specifically, calling alloc_debug_processing() before updating
> ->freelist, ->inuse fields)
> 
> That is beyond the scope of this patch, but do you want to address it
> in the next version
> of your patch series?
> 
> Cheers,
> Hyeonggon
> 

I'm glad to do that, just takes time.

Thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ