[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7cwexr7tLRIOlNx@bender.morinfr.org>
Date: Thu, 20 Feb 2025 14:39:07 +0100
From: Guillaume Morin <guillaume@...infr.org>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: Guillaume Morin <guillaume@...infr.org>, linux-raid@...r.kernel.org,
linux-kernel@...r.kernel.org, song@...nel.org,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [BUG] possible race between md_free_disk and md_notify_reboot
On 20 Feb 19:55, Yu Kuai wrote:
>
> > I just take a quick look, the problem looks obviously to me, see how
> > md_seq_show() handle the iteration.
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 465ca2af1e6e..7c7a58f618c1 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -9911,8 +9911,11 @@ static int md_notify_reboot(struct notifier_block
> > *this,
> > mddev_unlock(mddev);
> > }
> > need_delay = 1;
> > - mddev_put(mddev);
> > - spin_lock(&all_mddevs_lock);
> > +
> > + spin_lock(&all_mddevs_lock)
> > + if (atomic_dec_and_test(&mddev->active))
> > + __mddev_put(mddev);
> > +
> > }
> > spin_unlock(&all_mddevs_lock);
>
> While cooking the patch, this is not enough, list_for_each_entry_safe()
> should be replaced with list_for_each_entry() as well.
>
> Will send the patch soon, with:
>
> Reported-by: Guillaume Morin <guillaume@...infr.org>
Thank you! I just saw the patch and we are going to test it and let you
know.
The issue with the next pointer seems to be fixed with your change.
Though I am still unclear how the 2nd potential issue I mentioned -
where the current item would be freed concurrently by mddev_free() - is
prevented. I am not finding anything in the code that seems to prevent a
concurrent call to mddev_free() for the current item in the
list_for_each_entry() loop (and therefore accessing mddev after the
kfree()).
I understand that we are getting a reference through the active atomic
in mddev_get() under the lock in md_notify_reboot() but how is that
preventing mddev_free() from freeing the mddev as soon as we release the
all_mddevs_lock in the loop?
I am not not familiar with this code so I am most likely missing
osmething but if you had the time to explain, that would be very
helpful.
TIA
Guillaume.
--
Guillaume Morin <guillaume@...infr.org>
Powered by blists - more mailing lists