[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <18503.9570.282729.926846@notabene.brown>
Date: Thu, 5 Jun 2008 09:29:38 +1000
From: Neil Brown <neilb@...e.de>
To: Dave Jones <davej@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: 2.6.25 md oops during boot.
On Thursday June 5, neilb@...e.de wrote:
>
> I wouldn't say this is a likely scenario as it requires (I think)
> kmalloc failure very early in boot. But I cannot see any other
> possible cause.
On closer inspection, I can see another possible cause. I don't think
it is likely (yet) but it might be possible.
If two threads enter md_probe for the same mddev, then the second one
to get disks_mutex could exit before the first had called
kobject_init_and_add, so it could make available an mddev where
kobj.sd was NULL.
I cannot imagine how two threads could be doing that so early in boot,
but I cannot rule it out.
This (untested) patch should close both these possible problems.
NeilBrown
-----------------
Fix error paths if md_probe fails.
md_probe can fail (e.g. alloc_disk could fail) without
returning an error (as it alway returns NULL).
So when we call mddev_find immediately afterwards, we need
to check that md_probe actually succeeded. This means checking
that mdev->gendisk is non-NULL.
Also there is a possible race - if two threads call md_probe
for the same device, then one could exit (having checked that
->gendisk exists) before the other has called kobject_init_and_add,
thus returning an incomplete kobj which is cause problems when
we try to add children to it.
So extend the range of protection of disks_mutex slightly to
avoid this possibility.
Cc: Dave Jones <davej@...hat.com>
Signed-off-by: Neil Brown <neilb@...e.de>
### Diffstat output
./drivers/md/md.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c 2008-06-03 16:35:41.000000000 +1000
+++ ./drivers/md/md.c 2008-06-05 09:19:56.000000000 +1000
@@ -3363,9 +3363,9 @@ static struct kobject *md_probe(dev_t de
disk->queue = mddev->queue;
add_disk(disk);
mddev->gendisk = disk;
- mutex_unlock(&disks_mutex);
error = kobject_init_and_add(&mddev->kobj, &md_ktype, &disk->dev.kobj,
"%s", "md");
+ mutex_unlock(&disks_mutex);
if (error)
printk(KERN_WARNING "md: cannot register %s/md - name in use\n",
disk->disk_name);
@@ -3935,8 +3935,10 @@ static void autorun_devices(int part)
md_probe(dev, NULL, NULL);
mddev = mddev_find(dev);
- if (!mddev) {
- printk(KERN_ERR
+ if (!mddev || !mddev->gendisk) {
+ if (mddev)
+ mddev_put(mddev);
+ printk(KERN_ERR
"md: cannot allocate memory for md drive.\n");
break;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists