linux-kernel - Re: Software RAID partitions not detected at boot time with linux 2.6.25.1.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18461.45667.588613.443564@notabene.brown>
Date:	Sun, 4 May 2008 22:56:03 +1000
From:	Neil Brown <neilb@...e.de>
To:	Joao Luis Meloni Assirati <assirati@...ada.if.usp.br>,
	Kay Sievers <kay.sievers@...y.org>,
	Greg Kroah-Hartman <gregkh@...e.de>
Cc:	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: Software RAID partitions not detected at boot time with linux 2.6.25.1.

On Saturday May 3, assirati@...ada.if.usp.br wrote:
> Let's try again, this this time with a proper e-mail subject.
> 
> Hello,
> 
> The problem I reported in
> http://marc.info/?l=linux-kernel&m=120804001406787&w=2
> still persists at 2.6.25.1. (sorry for the repeated messages there; that was 
> my mail client's fault).
> 
> In my previous bug repor, I was using raid1; now, I am using raid5, as 
> follows.
> 
> I have three partitions /dev/sda2, /dev/sdb2 and /dev/sdc2 marked as raid 
> autodetect (FD). They are assembled as a partitionable raid 5 array, and my 
> root is in /dev/md_d0p1. With a 2.6.24.2 kernel with sata, ext3 and raid5 
> compiled in (I don't use initrd), the system boots fine. In fact, the raid 
> array and partitions are detected as show with dmesg:
> 
> md: Autodetecting RAID arrays.
> md: Scanned 3 and added 3 devices.
> md: autorun ...
> md: considering sdc2 ...
> md:  adding sdc2 ...
> md:  adding sdb2 ...
> md:  adding sda2 ...
> md: created md_d0
> md: bind<sda2>
> md: bind<sdb2>
> md: bind<sdc2>
> md: running: <sdc2><sdb2><sda2>
> raid5: device sdc2 operational as raid disk 2
> raid5: device sdb2 operational as raid disk 1
> raid5: device sda2 operational as raid disk 0
> raid5: allocated 3226kB for md_d0
> raid5: raid level 5 set md_d0 active with 3 out of 3 devices, algorithm 2
> RAID5 conf printout:
>  --- rd:3 wd:3
>  disk 0, o:1, dev:sda2
>  disk 1, o:1, dev:sdb2
>  disk 2, o:1, dev:sdc2
> md: ... autorun DONE.
>  md_d0: p1 p2 p3 p4 < p5 p6 p7 >
> 
> The last line shows my raid partitions are detected.
> 
> However, when I try kernel 2.6.25.1, again with sata, ext3 and raid5 compiled 
> in and no initrd, the kernel does not recognize the raid partitions at boot 
> time, nonetheless the array is dected. The output of dmesg is the same  until:
> 
> raid5: allocated 3224kB for md_d0
> raid5: raid level 5 set md_d0 active with 3 out of 3 devices, algorithm 2
> RAID5 conf printout:
>  --- rd:3 wd:3
>  disk 0, o:1, dev:sda2
>  disk 1, o:1, dev:sdb2
>  disk 2, o:1, dev:sdc2
> md: ... autorun DONE.
> VFS: Cannot open root device "md_d0p1" or unknown-block(0,0)
> Please append a coorect "root=" boot option; here are the available partitions:
> 0800  312571224 sda driver:sd
>   0801      96358 sda1
>   0802  312472282 sda2
> 0810  312571224 sdb driver:sd
>   0811      96358 sdb1
>   0812  312472282 sdb2
> 0820  312571224 sdc driver:sd
>   0821      96358 sdc1
>   0822  312472282 sdc2
> fe00  624944384 md_d0 (driver?)
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> 
> Note the absence of the line
>  md_d0: p1 p2 p3 p4 < p5 p6 p7 >
> showing partition detection.
> 

It appears this regression was introduced by commit
   edfaa7c36574f1bf09c65ad602412db9da5f96bf
   Driver core: convert block from raw kobjects to core devices

Previously, the device name "md_d0p1" would be parsed out as "md_d0"
and partition 1.
md_d0 would be found and the device number of the first partition
could then be deduced even though the partition table hadn't been
processed at this point, so 'p1' wasn't actually known.
The kernel would attempt to open 'p1', this would trigger a read of
the partition table and the open would be successful.

The new code is 'simpler'.  It doesn't try to parse the device name at
all.  It just compares the device name against the names of all known
devices.  As the partitions are not known at this time, md_d0p1 is not
found.

There seem to be two ways this could be fixed:
1/ restore the old behaviour of parsing out the partition information.
   This would be least likely to leave other regressions waiting to be
   found, but might be seen as somewhat ugly.

2/ Get md_d0 to read it's partition table earlier.  I've tried this
   before without much luck.
   There are three times that the partition table can be parsed.
   a/ When the device is opened if bd_invalidated is set.
   b/ When the device is first registered.  register_disk calls
      blkdev_get which effectively opens the device, thus triggering
      'a' above.
   c/ When the BLKRRPART ioctl is made on the device.

   When an md device is first registered, there are no disks attached
   to it, so no partition table can be read.  The way md devices get
   set up, they are registered as empty devices, the component devices
   are attached, then the array is 'started'.  Only at this point can
   data, such as the partition table, be read.  But this it too late
   for case 'b' above.  Case 'a' is the one that usually causes
   partitions to be read.  'mdadm' explicitly opens devices after
   creating them to trigger this.

   We could conceivably put a BLKRRPART call in at the end of
   autorun_array where the array has just been started, but that would
   be rather ugly.  We would need to open the device, and doing that
   inside the kernel is never clean.  The code in 'init/*.c' for that
   sort of thing creates nodes in /dev in a temporary root filesystem.
   We cannot do that for autorun_array as it can be call long after
   boot time (but the RAID_AUTODETECT ioctl) so there may not be 
   a temporary filesystem to play with.

So I currently think that restoring the old behaviour of not requiring
partitions to exist before trying to open them would be best.

Kay?


NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/