Message-Id: <20070212000042.M73586@liquid-nexus.net>
Date: Mon, 12 Feb 2007 08:03:57 +0800
From: "Marc Marais" <marcm@...uid-nexus.net>
To: Neil Brown <neilb@...e.de>
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: md: md6_raid5 crash 2.6.20
On Mon, 12 Feb 2007 09:02:33 +1100, Neil Brown wrote:
> On Sunday February 11, marcm@...uid-nexus.net wrote:
> > Greetings,
> >
> > I've been running md on my server for some time now and a few days ago one of
> > the (3) drives in the raid5 array started giving read errors. The result was
> > usually system hangs and this was with kernel 2.6.17.13. I upgraded to the
> > latest production 2.6.20 kernel and experienced the same behaviour.
>
> System hangs suggest a problem with the drive controller. However
> this "kernel BUG" is something newly introduced in 2.6.20 which
> should be fixed in 2.6.20.1. Patch is below.
>
> If you still get hangs with this patch installed, then please report
> the details, and probably copy linux-ide@...r.kernel.org.
>
> NeilBrown
>
> Fix various bugs with aligned reads in RAID5.
>
> It is possible for raid5 to be sent a bio that is too big
> for an underlying device. So if it is a READ that we
> pass straight down to a device, it will fail and confuse
> RAID5.
>
> So in 'chunk_aligned_read' we check that the bio fits within the
> parameters for the target device and if it doesn't fit, fall back
> on reading through the stripe cache and making lots of one-page
> requests.
>
> Note that this is the earliest time we can check against the device
> because earlier we don't have a lock on the device, so it could
> change underneath us.
>
> Also, the code for handling a retry through the cache when a read
> fails had not been tested and was badly broken. This patch fixes
> that code.
>
> Signed-off-by: Neil Brown <neilb@...e.de>
>
Thanks for the quick response, Neil. Unfortunately, the kernel doesn't build
with this patch due to a missing symbol:
WARNING: "blk_recount_segments" [drivers/md/raid456.ko] undefined!
Is that in another file that needs patching or within raid5.c?
Marc
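
For readers following the quoted patch notes, the check described for
'chunk_aligned_read' can be sketched in userspace C roughly as below. This is
only an illustration, not the kernel code: the structs, field names, and
functions (dev_limits, read_req, req_fits_device, choose_read_path) are
hypothetical stand-ins, and it assumes the "parameters for the target device"
are a maximum request size in sectors and a maximum segment count.

```c
#include <stdbool.h>

/* Hypothetical model of the limits a target device advertises. */
struct dev_limits {
    unsigned max_sectors;   /* largest request, in 512-byte sectors */
    unsigned max_segments;  /* most scatter/gather segments per request */
};

/* Hypothetical model of a read request (a "bio" in kernel terms). */
struct read_req {
    unsigned size_bytes;    /* total length of the read */
    unsigned segments;      /* segment count, assumed already recounted */
};

enum read_path { READ_DIRECT, READ_VIA_STRIPE_CACHE };

/* Return true if the request is within the device's limits. */
bool req_fits_device(const struct read_req *r, const struct dev_limits *d)
{
    if ((r->size_bytes >> 9) > d->max_sectors)  /* bytes to sectors */
        return false;
    if (r->segments > d->max_segments)
        return false;
    return true;
}

/*
 * Pass the read straight to the device only when it fits; otherwise
 * fall back to the slower path through the stripe cache.
 */
enum read_path choose_read_path(const struct read_req *r,
                                const struct dev_limits *d)
{
    return req_fits_device(r, d) ? READ_DIRECT : READ_VIA_STRIPE_CACHE;
}
```

With an assumed limit of 128 sectors and 32 segments, a 64 KiB read in 16
segments would go straight to the device, while a 128 KiB read would fall back
to the stripe cache.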