Date:	Thu, 26 Jul 2012 22:15:03 -0400
From:	David Dillow <dave@...dillows.org>
To:	Kevin Ross <kevin@...ilyross.net>
Cc:	Phil Turmel <philip@...mel.org>, linux-kernel@...r.kernel.org,
	linux-raid <linux-raid@...r.kernel.org>
Subject: Re: RAID extremely slow

On Wed, 2012-07-25 at 18:55 -0700, Kevin Ross wrote:
> On 07/25/2012 06:00 PM, Phil Turmel wrote:
> > Piles of small reads scattered across multiple drives, and a
> > concentration of queued writes to /dev/sda.  What's on /dev/sda?
> > It's not a member of the raid, so it must be some other system task
> > involved.

> After rebooting, MythTV is currently recording two shows, and the resync 
> is running at full speed.
> 
> # cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdh1[0] sdd1[9] sde1[10] sdb1[6] sdi1[7] sdc1[4] sdf1[3] sdg1[8] sdj1[1]
>        6837311488 blocks super 1.2 level 6, 512k chunk, algorithm 2 [9/9] [UUUUUUUUU]
>        [=>...................]  resync =  9.3% (91363840/976758784) finish=1434.3min speed=10287K/sec
> 
> unused devices: <none>
> 
> atop shows the avio of all the drives to be less than 1ms, where before 
> they were much higher.  It will run for a couple days under load just 
> fine, and then it will come to a halt.
> 
> It's a 3.2.21 kernel.  I'm running Debian Testing, and the exact Debian 
> package version is:

I suspect you are being hit by the same bug I was: delayed stripes never
got processed. If you get into the state where the rebuild isn't
progressing, and you find that increasing the size of the stripe cache
allows the rebuild to proceed (but the filesystem stays wedged), then
that cinches it.
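
For what it's worth, the check described above could be scripted roughly
as below. This is only a sketch, not something I have run against a wedged
array. It assumes the array is md0, that /proc/mdstat has the usual
unwrapped layout, and that the md sysfs attribute stripe_cache_size is
present (RAID4/5/6 only). The function names, the doubling factor and the
sysfs_root parameter are made up for illustration; the 32768 ceiling is
the maximum I remember from the md documentation.

```python
import re
from pathlib import Path

# Upper bound for stripe_cache_size as I recall it from the md docs
# (assumption; check the documentation for your kernel).
MAX_STRIPE_CACHE = 32768


def resync_progress(mdstat_text, array="md0"):
    """Return (done_blocks, total_blocks) for the array's resync or
    recovery, or None if no such progress line is found."""
    in_array = False
    for line in mdstat_text.splitlines():
        if re.match(r"^\S+ :", line):
            # Header line such as "md0 : active raid6 ..."
            in_array = line.startswith(array + " :")
        elif in_array:
            m = re.search(
                r"(?:resync|recovery)\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\)", line
            )
            if m:
                return int(m.group(1)), int(m.group(2))
    return None


def grow_stripe_cache(array="md0", factor=2, sysfs_root="/sys/block"):
    """Multiply stripe_cache_size by `factor`, capped at MAX_STRIPE_CACHE.
    Returns (old, new). Needs root on a real system."""
    path = Path(sysfs_root) / array / "md" / "stripe_cache_size"
    old = int(path.read_text().strip())
    new = min(old * factor, MAX_STRIPE_CACHE)
    path.write_text(f"{new}\n")
    return old, new
```

The idea would be to call resync_progress() on the contents of
/proc/mdstat twice, a minute or so apart. If the first number has not
moved, run grow_stripe_cache() as root and sample again. If the resync
starts moving but the filesystem is still stuck, that matches what I saw.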

If you can, upgrade to the latest 3.4 stable kernel (3.4.6 right now).
As far as I can see, the latest 3.2 stable does not contain the delayed
stripe fix.

After applying those fixes to my kernel, my MythTV setup over a 5-disk
RAID5 has been pretty solid, where before I was getting lockups every
few days. It still seems to be getting slower over time, but I've not
looked into it yet as it is not as catastrophic as the wedging.

HTH,
Dave
