linux-kernel - Re: slub/debugobjects: lockup when freeing memory

Date:	Wed, 20 Aug 2014 05:19:59 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Sasha Levin <sasha.levin@...cle.com>,
	Pekka Enberg <penberg@...nel.org>,
	Matt Mackall <mpm@...enic.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: slub/debugobjects: lockup when freeing memory

On Wed, Aug 20, 2014 at 01:01:19AM -0500, Christoph Lameter wrote:
> On Tue, 19 Aug 2014, Paul E. McKenney wrote:
> 
> > > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head
> > > are no ops if CONFIG_DEBUG_RCU_HEAD is not defined.
> >
> > And indeed they are, good point!  It appears to me that both sets of
> > #ifdefs can go away.
> 
> Ok then this is a first workable version I think. How do we test this?

It looks good to me.

Sasha, could you please try this out?  This should fix the problem
you reported here:  https://lkml.org/lkml/2014/6/19/306

							Thanx, Paul
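For readers following the #ifdef discussion quoted above, here is a minimal
userspace sketch of why the call sites need no #ifdefs when the helpers
compile to empty inline functions. Everything in it is an assumption made
for illustration: SKETCH_DEBUG_RCU_HEAD stands in for the kernel config
option, and the sketch_* names and the struct layout are hypothetical, not
the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#ifdef SKETCH_DEBUG_RCU_HEAD
/* Debug build: keep some tracking state outside the rcu_head itself. */
static int sketch_tracked;
static inline void sketch_init_rcu_head(struct rcu_head *head)
{
	(void)head;
	sketch_tracked++;
}
static inline void sketch_destroy_rcu_head(struct rcu_head *head)
{
	(void)head;
	sketch_tracked--;
}
#else
/* Non-debug build: both helpers are empty and compile away. */
static inline void sketch_init_rcu_head(struct rcu_head *head)
{
	(void)head;
}
static inline void sketch_destroy_rcu_head(struct rcu_head *head)
{
	(void)head;
}
#endif

/* The caller is identical in both configurations: no #ifdef needed. */
static void sketch_caller(struct rcu_head *head)
{
	assert(head != NULL);
	sketch_init_rcu_head(head);
	sketch_destroy_rcu_head(head);
}
```

In either configuration the helpers leave the fields of the rcu_head
untouched, which matches the requirement stated in the patch description.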

> From: Christoph Lameter <cl@...ux.com>
> Subject: slub: Add init/destroy function calls for rcu_heads
> 
> To debug rcu_head use properly, some additional structures
> must be allocated whenever an object that may use an
> rcu_head is allocated in the slub allocator.
> 
> This adds the proper calls to init_rcu_head()
> and destroy_rcu_head().
> 
> init_rcu_head() is a somewhat unusual function because:
> 1. It does not touch the contents of the rcu_head. This is
>    required because the rcu_head is only used while a slab
>    page is being freed. At all other times the same memory
>    location is used for slab page list management. However,
>    the initialization occurs when the slab page is first
>    allocated, so between init_rcu_head() and destroy_rcu_head()
>    the indicated address may be used several times as a
>    list_head.
> 
> 2. It is called without gfp flags and could potentially
>    be called from atomic contexts. Allocations from init_rcu_head()
>    context need to deal with this.
> 
> 3. init_rcu_head() is called from within the slab allocation
>    functions. Since init_rcu_head() calls the allocator again
>    for more allocations, it must avoid using slabs that use
>    RCU freeing. Otherwise endless recursion may occur.
>    (We may have to convince lockdep that what we do here is sane.)
> 
> Signed-off-by: Christoph Lameter <cl@...ux.com>
> 
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa
>  	return page;
>  }
> 
> +#define need_reserve_slab_rcu						\
> +	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> +
> +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
> +{
> +	if (need_reserve_slab_rcu) {
> +		int order = compound_order(page);
> +		int offset = (PAGE_SIZE << order) - s->reserved;
> +
> +		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
> +		return page_address(page) + offset;
> +	} else {
> +		/*
> +		 * RCU free overloads the RCU head over the LRU
> +		 */
> +		return (void *)&page->lru;
> +	}
> +}
> +
>  static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  {
>  	struct page *page;
> @@ -1357,6 +1376,29 @@ static struct page *allocate_slab(struct
>  			kmemcheck_mark_unallocated_pages(page, pages);
>  	}
> 
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
> +		/*
> +		 * Initialize various things. However, this init is
> +		 * not allowed to modify the contents of the rcu head.
> +		 * The allocator typically overloads the rcu head over
> +		 * page->lru which is also used to manage lists of
> +		 * slab pages.
> +		 *
> +		 * Allocations are permitted in init_rcu_head().
> +		 * However, the use of the same cache or another
> +		 * cache with SLAB_DESTROY_BY_RCU set will cause
> +		 * additional recursions.
> +		 *
> +		 * So in order to be safe the slab caches used
> +		 * in init_rcu_head() should be restricted to be of the
> +		 * non rcu kind only.
> +		 *
> +		 * Note also that no GFPFLAG is passed. The function
> +		 * may therefore be called from atomic contexts
> +		 * and somehow(?) needs to do the right thing.
> +		 */
> +		init_rcu_head(get_rcu_head(s, page));
> +
>  	if (flags & __GFP_WAIT)
>  		local_irq_disable();
>  	if (!page)
> @@ -1452,13 +1494,11 @@ static void __free_slab(struct kmem_cach
>  	memcg_uncharge_slab(s, order);
>  }
> 
> -#define need_reserve_slab_rcu						\
> -	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> -
>  static void rcu_free_slab(struct rcu_head *h)
>  {
>  	struct page *page;
> 
> +	destroy_rcu_head(h);
>  	if (need_reserve_slab_rcu)
>  		page = virt_to_head_page(h);
>  	else
> @@ -1469,24 +1509,9 @@ static void rcu_free_slab(struct rcu_hea
> 
>  static void free_slab(struct kmem_cache *s, struct page *page)
>  {
> -	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
> -		struct rcu_head *head;
> -
> -		if (need_reserve_slab_rcu) {
> -			int order = compound_order(page);
> -			int offset = (PAGE_SIZE << order) - s->reserved;
> -
> -			VM_BUG_ON(s->reserved != sizeof(*head));
> -			head = page_address(page) + offset;
> -		} else {
> -			/*
> -			 * RCU free overloads the RCU head over the LRU
> -			 */
> -			head = (void *)&page->lru;
> -		}
> -
> -		call_rcu(head, rcu_free_slab);
> -	} else
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
> +		call_rcu(get_rcu_head(s, page), rcu_free_slab);
> +	else
>  		__free_slab(s, page);
>  }
> 
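The address computation that the patch factors out into get_rcu_head() can
be tried in userspace with the sketch below. It is only an illustration
under stated assumptions: the sketch_* types, SKETCH_PAGE_SIZE, and the
need_reserve parameter are hypothetical stand-ins (in the patch the choice
is made at compile time by the need_reserve_slab_rcu macro), and assert()
replaces VM_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define SKETCH_PAGE_SIZE 4096u

struct sketch_page {
	struct list_head lru;	/* slab page list linkage */
	void *addr;		/* start of slab memory, like page_address() */
	unsigned int order;	/* slab spans SKETCH_PAGE_SIZE << order bytes */
};

struct sketch_cache {
	size_t reserved;	/* bytes reserved at the end of each slab */
};

/*
 * Return where the rcu_head for this slab lives. With need_reserve set,
 * it sits in the reserved bytes at the very end of the slab memory.
 * Otherwise it overlays page->lru, which is why initialization must not
 * write to it while the page is still on a list.
 */
static struct rcu_head *sketch_get_rcu_head(const struct sketch_cache *s,
					    struct sketch_page *page,
					    int need_reserve)
{
	if (need_reserve) {
		size_t offset = ((size_t)SKETCH_PAGE_SIZE << page->order)
				- s->reserved;

		assert(s->reserved == sizeof(struct rcu_head));
		return (struct rcu_head *)((char *)page->addr + offset);
	}
	return (struct rcu_head *)&page->lru;
}
```

Because allocate_slab() and free_slab() both go through the same helper in
the patch, the pointer handed to init_rcu_head() is the same one later
passed to call_rcu() and then to destroy_rcu_head() in rcu_free_slab().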
