Message-ID: <20170530172735.phicnjvr6ruo7grr@kernel.org>
Date: Tue, 30 May 2017 10:27:35 -0700
From: Shaohua Li <shli@...nel.org>
To: Ben Hutchings <ben.hutchings@...ethink.co.uk>
Cc: Dennis Yang <dennisyang@...p.com>, NeilBrown <neilb@...e.com>,
Shaohua Li <shli@...com>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH 4.4 018/103] md: update slab_cache before releasing new
stripes when stripes resizing
On Tue, May 30, 2017 at 02:16:53PM +0100, Ben Hutchings wrote:
> On Tue, 2017-05-23 at 22:08 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Dennis Yang <dennisyang@...p.com>
> >
> > commit 583da48e388f472e8818d9bb60ef6a1d40ee9f9d upstream.
> >
> > When growing a raid5 device on a machine with little memory, there is a chance that
> > mdadm will be killed and the following bug report can be observed. The same
> > bug can also be reproduced in linux-4.10.6.
> [...]
> > The problem is that resize_stripes() releases new stripe_heads before assigning the new
> > slab cache to conf->slab_cache. If the shrinker function raid5_cache_scan() gets called
> > after resize_stripes() starts releasing new stripes but right before the new slab cache
> > is assigned, it is possible that these new stripe_heads will be freed with the old
> > slab_cache, which has already been destroyed, and that triggers this bug.
> [...]
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -2232,6 +2232,10 @@ static int resize_stripes(struct r5conf
> > err = -ENOMEM;
> >
> > mutex_unlock(&conf->cache_size_mutex);
> > +
> > + conf->slab_cache = sc;
> > + conf->active_name = 1-conf->active_name;
> > +
> > /* Step 4, return new stripes to service */
> > while(!list_empty(&newstripes)) {
> > nsh = list_entry(newstripes.next, struct stripe_head, lru);
> [...]
>
> The assignments are still being done after conf->cache_size_mutex is
> unlocked, so there still seems to be a race with raid5_cache_scan().
> Shouldn't they be moved above the mutex_unlock()?
That's unnecessary. raid5_cache_scan() can't free a stripe back to slab_cache
until raid5_release_stripe() has been called on that stripe, and with this
patch the assignments happen before any new stripe is released.
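
To make the ordering argument concrete, here is a rough single-threaded toy
model. It is not kernel code: every toy_* name, the integer cache ids and the
single "inactive" list are assumptions made for illustration, and the shrinker
is just called by hand at the two interesting points. It only shows that a new
stripe is invisible to the shrinker until it has been released, so what matters
is assigning slab_cache before the first release, not before the unlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache ids standing in for the old and new slab caches. */
enum { TOY_OLD_CACHE = 1, TOY_NEW_CACHE = 2 };

struct toy_stripe {
    int cache_id;               /* cache this stripe was allocated from */
    struct toy_stripe *next;
};

struct toy_conf {
    int slab_cache;             /* cache the shrinker frees stripes to */
    struct toy_stripe *inactive; /* the only list the shrinker looks at */
};

/*
 * Stand-in for the shrinker: take one stripe off the inactive list.
 * Returns 0 if there was nothing to free, 1 if the stripe would be freed
 * to the cache it came from, -1 if it would be freed to the wrong cache.
 */
static int toy_shrink_one(struct toy_conf *conf)
{
    struct toy_stripe *sh = conf->inactive;

    if (sh == NULL)
        return 0;
    conf->inactive = sh->next;
    sh->next = NULL;
    return sh->cache_id == conf->slab_cache ? 1 : -1;
}

/* Stand-in for releasing a stripe: it becomes visible to the shrinker. */
static void toy_release_stripe(struct toy_conf *conf, struct toy_stripe *sh)
{
    sh->next = conf->inactive;
    conf->inactive = sh;
}

struct toy_result {
    int before_release;         /* shrinker run after unlock, before release */
    int after_release;          /* shrinker run after the first release */
};

/* Replay the tail of a resize with either ordering of the assignment. */
static struct toy_result toy_resize(int assign_before_release)
{
    struct toy_conf conf = { TOY_OLD_CACHE, NULL };
    struct toy_stripe nsh = { TOY_NEW_CACHE, NULL };
    struct toy_result res;

    /* The mutex has been dropped, but nsh is still on a private list. */
    res.before_release = toy_shrink_one(&conf);

    if (assign_before_release)
        conf.slab_cache = TOY_NEW_CACHE;    /* patched ordering */

    toy_release_stripe(&conf, &nsh);
    res.after_release = toy_shrink_one(&conf);

    if (!assign_before_release)
        conf.slab_cache = TOY_NEW_CACHE;    /* pre-patch ordering */

    return res;
}
```

In this model toy_resize(1) never frees to the wrong cache, while toy_resize(0)
does so only after the release, never in the window between the unlock and the
assignment.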
Thanks,
Shaohua