lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 26 Sep 2020 09:43:25 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Roman Gushchin <guro@...com>, Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Theodore Y . Ts'o" <tytso@....edu>, Jens Axboe <axboe@...nel.dk>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        linux-block <linux-block@...r.kernel.org>,
        Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling
 into blk_mq_get_driver_tag

On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote:
> On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt <shakeelb@...gle.com>
> wrote:
> >
> [snip]
> >
> > I don't think you can ignore the flushing. The __free_once() in
> > ___cache_free() assumes there is a space available.
> >
> > BTW do_drain() also have the same issue.
> >
> > Why not move slabs_destroy() after we update ac->avail and memmove()?
> 
> Ming, can you please try the following patch?
> 
> 
> From: Shakeel Butt <shakeelb@...gle.com>
> 
> [PATCH] mm: slab: fix potential infinite recursion in ___cache_free
> 
> With the commit 10befea91b61 ("mm: memcg/slab: use a single set of
> kmem_caches for all allocations"), it becomes possible to call kfree()
> from the slabs_destroy(). However if slabs_destroy() is being called for
> the array_cache of the local CPU then this opens the potential scenario
> of infinite recursion because kfree() called from slabs_destroy() can
> call slabs_destroy() with the same array_cache of the local CPU. Since
> the array_cache of the local CPU is not updated before calling
> slabs_destroy(), it will try to free the same pages.
> 
> To fix the issue, simply update the cache before calling
> slabs_destroy().
> 
> Signed-off-by: Shakeel Butt <shakeelb@...gle.com>
> ---
>  mm/slab.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 3160dff6fd76..f658e86ec8ce 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1632,6 +1632,10 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
>  		kmem_cache_free(cachep->freelist_cache, freelist);
>  }
>  
> +/*
> + * Update the size of the caches before calling slabs_destroy as it may
> + * recursively call kfree.
> + */
>  static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
>  {
>  	struct page *page, *n;
> @@ -2153,8 +2157,8 @@ static void do_drain(void *arg)
>  	spin_lock(&n->list_lock);
>  	free_block(cachep, ac->entry, ac->avail, node, &list);
>  	spin_unlock(&n->list_lock);
> -	slabs_destroy(cachep, &list);
>  	ac->avail = 0;
> +	slabs_destroy(cachep, &list);
>  }
>  
>  static void drain_cpu_caches(struct kmem_cache *cachep)
> @@ -3402,9 +3406,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
>  	}
>  #endif
>  	spin_unlock(&n->list_lock);
> -	slabs_destroy(cachep, &list);
>  	ac->avail -= batchcount;
>  	memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail);
> +	slabs_destroy(cachep, &list);
>  }

The issue can't be reproduced after applying this patch:

Tested-by: Ming Lei <ming.lei@...hat.com>

Thanks,
Ming

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ