lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAB=+i9Rfy_+4vvbbuLPrVUDU=9n-Mpp2ab5t2f2fq6NG=RAXYA@mail.gmail.com>
Date: Sun, 6 Oct 2024 22:00:57 +0900
From: Hyeonggon Yoo <42.hyeyoo@...il.com>
To: "yuan.gao" <yuan.gao@...oud.cn>
Cc: cl@...ux.com, penberg@...nel.org, rientjes@...gle.com, 
	iamjoonsoo.kim@....com, akpm@...ux-foundation.org, vbabka@...e.cz, 
	roman.gushchin@...ux.dev, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/slub: Avoid list corruption when removing a slab from the full list   

On Sun, Oct 6, 2024 at 1:48 PM yuan.gao <yuan.gao@...oud.cn> wrote:
>
> Boot with slub_debug=UFPZ.
>
> If allocated object failed in alloc_consistency_checks, all objects of
> the slab will be marked as used, and then the slab will be removed from
> the partial list.
>
> When an object belonging to the slab got freed later, the remove_full()
> function is called. Because the slab is neither on the partial list nor
> on the full list, it eventually lead to a list corruption.

Good catch! Thanks for investigating the cause and fixing it.

> So we need to add the slab to full list in this case.

While I believe that behavior is not intended by alloc_debug_processing(),
I can't think of a better fix here without adding some complexity.
The approach looks fine to me.

> ---
>  mm/slub.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 21f71cb6cc06..a99522b9efc0 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2746,6 +2746,8 @@ static void *alloc_single_from_partial(struct kmem_cache *s,
>
>         if (!alloc_debug_processing(s, slab, object, orig_size)) {
>                 remove_partial(n, slab);
> +               if (slab->inuse == slab->objects)
> +                       add_full(s, n, slab);

Shouldn't this be (folio_test_slab(slab_folio(slab))) instead of
(slab->inuse == slab->objects)?
Oh wait. the kernel also should not call remove_partial() for non-slab folios.

So I think it should be:

if (!alloc_debug_processing(s, slab, object, orig_size)) {
        if (folio_test_slab(slab_folio(slab))) {
                remove_partial(n, slab);
                add_full(s, n, slab);
        }
}

By the way, SLUB always messes with struct page fields even when it is
not a slab,
and I think SLUB should avoid modifying those fields before confirming
it is a slab.
(specifically, calling alloc_debug_processing() before updating
->freelist, ->inuse fields)

That is beyond the scope of this patch, but do you want to address it
in the next version
of your patch series?

Cheers,
Hyeonggon

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ