Message-ID: <20110309192642.GA4098@localhost>
Date: Wed, 9 Mar 2011 20:26:42 +0100
From: Johan Hovold <jhovold@...il.com>
To: NeilBrown <neilb@...e.de>
Cc: Greg Kroah-Hartman <gregkh@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: MD-raid broken in 2.6.37.3?

On Wed, Mar 09, 2011 at 09:02:51PM +1100, NeilBrown wrote:
> On Wed, 9 Mar 2011 10:06:22 +0100 Johan Hovold <jhovold@...il.com> wrote:
>
> > Hi Greg and Neil,
> >
> > I updated from 2.6.37.2 to 2.6.37.3 yesterday only to find that my
> > raid-0 partitions are no longer recognised. The raid-1 ones still are,
> > though. They did not show up after a reboot. (It has happened once
> > fairly recently that these exact partitions were not recognised but a
> > reboot fixed it -- blamed my disks.)
> >
> > Today I mistakenly booted into 2.6.37.3 again -- still missing. No
> > problems with 2.6.37.2.
> >
> > Browsing the changelog I found f663ed60892c3e1d4490b079a45d9e546271c40c
> > (md: Fix - again - partition detection when array becomes active) and
> > other md-related changes so I figure one of these could perhaps be to
> > blame?
> >
> > As it is my personal/production machine I feel uncomfortable bisecting
> > this at this point, but maybe Neil has an idea of what might be going
> > on?
>
> Hi Johan,
>
> could you please be a bit more specific about the problem that you
> experienced.
> What, exactly, was "no longer recognised"?
>
> Was it that the array (e.g. /dev/md1) didn't appear, or was it that the
> array did appear, but that it has a partition table, and the partitions
> (e.g. /dev/md1p1, /dev/md1p2) did not appear?

It's the whole array that is missing. The raid-1 arrays appear but the
raid-0 does not.

> If you still have the boot-log from when you booted 2.6.37.3 (or can
> recreated) and can get a similar log for 2.6.37.2, then it might be useful to
> compare them.

Attaching two boot logs for 2.6.37.3 with /dev/md6 missing, and one for
2.6.37.2.

Note that md1, md2, and md3 have v0.90 superblocks, whereas md5 and md6 have
v1.20 ones and are assembled later.

When /dev/md6 is successfully assembled by the Gentoo init scripts
calling "mdadm -As", the log contains:

messages.2:Mar 8 20:44:19 xi kernel: md: bind<sda6>
messages.2:Mar 8 20:44:19 xi kernel: md: bind<sda5>
messages.2:Mar 8 20:44:19 xi kernel: md: bind<sdb5>
messages.2:Mar 8 20:44:19 xi kernel: md: bind<sdb6>

and when it fails, either the sda6 or the sdb6 bind is missing:

messages.3-1:Mar 8 20:04:39 xi kernel: md: bind<sda6>
messages.3-1:Mar 8 20:04:39 xi kernel: md: bind<sdb5>
messages.3-1:Mar 8 20:04:39 xi kernel: md: bind<sda5>

messages.3-2:Mar 8 20:41:09 xi kernel: md: bind<sdb6>
messages.3-2:Mar 8 20:41:09 xi kernel: md: bind<sdb5>
messages.3-2:Mar 8 20:41:09 xi kernel: md: bind<sda5>
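
The comparison above can also be done mechanically. What follows is only a minimal Python sketch, assuming the log lines quoted above as input; the names bound_devices and missing_binds are made up for illustration and are not part of mdadm or any other tool. It collects the devices named in the "md: bind<...>" messages of each boot and reports which ones the failing boot lacks.

```python
import re

# Matches the kernel's "md: bind<DEVICE>" message as it appears in the logs above.
BIND_RE = re.compile(r"md: bind<([^>]+)>")


def bound_devices(log_lines):
    """Return the set of device names found in md bind messages."""
    devices = set()
    for line in log_lines:
        match = BIND_RE.search(line)
        if match:
            devices.add(match.group(1))
    return devices


def missing_binds(good_lines, bad_lines):
    """Return, sorted, the devices bound in the good boot but not in the bad one."""
    return sorted(bound_devices(good_lines) - bound_devices(bad_lines))


# Log excerpts copied from this message (2.6.37.2 boot, then the two 2.6.37.3 boots).
good = [
    "Mar 8 20:44:19 xi kernel: md: bind<sda6>",
    "Mar 8 20:44:19 xi kernel: md: bind<sda5>",
    "Mar 8 20:44:19 xi kernel: md: bind<sdb5>",
    "Mar 8 20:44:19 xi kernel: md: bind<sdb6>",
]
bad_1 = [
    "Mar 8 20:04:39 xi kernel: md: bind<sda6>",
    "Mar 8 20:04:39 xi kernel: md: bind<sdb5>",
    "Mar 8 20:04:39 xi kernel: md: bind<sda5>",
]
bad_2 = [
    "Mar 8 20:41:09 xi kernel: md: bind<sdb6>",
    "Mar 8 20:41:09 xi kernel: md: bind<sdb5>",
    "Mar 8 20:41:09 xi kernel: md: bind<sda5>",
]

print(missing_binds(good, bad_1))  # ['sdb6']
print(missing_binds(good, bad_2))  # ['sda6']
```

The same functions should work on whole log files by passing an open file object instead of a list, since only lines containing a bind message are considered.
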
I mentioned that something similar had happened before, but that a
reboot fixed it. Tonight I cannot reproduce the issue, so it could very
well be that the problem lies elsewhere and that only slightly changed
timings or the like made it appear three times in a row in the first
three 2.6.37.3 boots (with 2.6.37.2 working in between)...

Thanks,
Johan
View attachment "messages.3-1" of type "text/plain" (7762 bytes)
View attachment "messages.3-2" of type "text/plain" (7762 bytes)
View attachment "messages.2" of type "text/plain" (9821 bytes)