linux-kernel - Re: [PATCH 4.4 018/103] md: update slab_cache before releasing new stripes when stripes resizing


Message-ID: <1496150213.2083.55.camel@codethink.co.uk>
Date:   Tue, 30 May 2017 14:16:53 +0100
From:   Ben Hutchings <ben.hutchings@...ethink.co.uk>
To:     Dennis Yang <dennisyang@...p.com>, NeilBrown <neilb@...e.com>,
        Shaohua Li <shli@...com>
Cc:     linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH 4.4 018/103] md: update slab_cache before releasing new
 stripes when stripes resizing

On Tue, 2017-05-23 at 22:08 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Dennis Yang <dennisyang@...p.com>
> 
> commit 583da48e388f472e8818d9bb60ef6a1d40ee9f9d upstream.
> 
> When growing a raid5 device on a machine with little memory, there is a chance that
> mdadm will be killed and the following bug report can be observed. The same
> bug can also be reproduced in linux-4.10.6.
[...]
> The problem is that resize_stripes() releases new stripe_heads before assigning the new
> slab cache to conf->slab_cache. If the shrinker function raid5_cache_scan() gets called
> after resize_stripes() starts releasing new stripes but right before the new slab cache
> is assigned, it is possible that these new stripe_heads will be freed with the old
> slab_cache, which has already been destroyed, and that triggers this bug.
[...]
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2232,6 +2232,10 @@ static int resize_stripes(struct r5conf
>  		err = -ENOMEM;
>  
>  	mutex_unlock(&conf->cache_size_mutex);
> +
> +	conf->slab_cache = sc;
> +	conf->active_name = 1-conf->active_name;
> +
>  	/* Step 4, return new stripes to service */
>  	while(!list_empty(&newstripes)) {
>  		nsh = list_entry(newstripes.next, struct stripe_head, lru);
[...]

The assignments are still being done after conf->cache_size_mutex is
unlocked, so there still seems to be a race with raid5_cache_scan().
Shouldn't they be moved above the mutex_unlock()?
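
To make the ordering concrete, here is a rough userspace model of what I
mean. It is not kernel code and is untested against the driver: struct
cache, struct conf_model, resize_model() and scan_model() are made-up
stand-ins, and it assumes that raid5_cache_scan() takes
conf->cache_size_mutex and that the old cache is destroyed while the
mutex is held. The point is only that the new cache is published before
the unlock, so a scan serialised on the mutex never sees the destroyed
one.

```c
/* Userspace model only; names mirror raid5.c but none of this is kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct cache {
	int destroyed;                     /* hypothetical stand-in for struct kmem_cache */
};

struct conf_model {
	pthread_mutex_t cache_size_mutex;  /* models conf->cache_size_mutex */
	struct cache *slab_cache;          /* models conf->slab_cache */
	int active_name;                   /* models conf->active_name */
};

/* Resize with both assignments moved inside the critical section. */
static void resize_model(struct conf_model *conf, struct cache *sc)
{
	assert(sc != NULL);

	pthread_mutex_lock(&conf->cache_size_mutex);
	conf->slab_cache->destroyed = 1;   /* assumed: old cache destroyed under the lock */
	conf->slab_cache = sc;             /* publish the new cache before unlocking */
	conf->active_name = 1 - conf->active_name;
	pthread_mutex_unlock(&conf->cache_size_mutex);
}

/* Shrinker stand-in: returns 1 if it would free into a destroyed cache. */
static int scan_model(struct conf_model *conf)
{
	int bad;

	pthread_mutex_lock(&conf->cache_size_mutex);
	bad = conf->slab_cache->destroyed;
	pthread_mutex_unlock(&conf->cache_size_mutex);
	return bad;
}
```

With the assignments after the unlock instead, a scan_model() call
landing between the unlock and the assignment would still see the old,
destroyed cache.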

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.