Message-ID: <4C0E13A7.20402@msgid.tls.msk.ru>
Date:	Tue, 08 Jun 2010 13:55:51 +0400
From:	Michael Tokarev <mjt@....msk.ru>
To:	Linux-kernel <linux-kernel@...r.kernel.org>
Subject: xfs, aacraid 2.6.27 => 2.6.32 results in 6 times slowdown

Hello.

I've got a difficult issue here, and am asking if anyone else
has some experience or information about it.

Production environment (database).  A machine with an Adaptec
RAID SCSI controller, 6 drives in a raid10 array, an XFS filesystem
and an Oracle database on top of it (with - hopefully - proper
sunit/swidth).
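
(For the record, "xfs_info <mountpoint>" reports the sunit/swidth
values actually in effect, so that part can at least be double-checked
against the array geometry.)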

After upgrading the kernel from 2.6.27 to 2.6.32, users started
screaming about very bad performance.  Iostat reports increased I/O
latencies: I/O time goes up from ~5ms to ~30ms.  Switching back to
2.6.27 brings everything back to normal (or, rather, usual).
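
(iostat -x shows this as the "await" column: the average per-request
time in milliseconds, including time spent in the queue.)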

I tried testing I/O with a sample program which performs direct
random I/O on a given device.  All speeds are actually better in .32
than in .27, except for the random concurrent read+write test, where
.27 gives reads a bit more of a chance than .32.  Judging from the
synthetic tests I'd expect .32 to be faster, but apparently it is not.
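
For reference, here's a minimal sketch of the kind of test program I
mean (not the exact one I used; device path, block size and request
count are placeholders): random-offset O_DIRECT reads of a device,
reporting the average latency.

#define _GNU_SOURCE	/* for O_DIRECT */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define BLKSZ 8192	/* read size; placeholder */
#define NREQS 10000	/* number of requests; placeholder */

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "/dev/sda"; /* placeholder */
	struct timeval t0, t1;
	off_t devsz, off;
	double sec;
	void *buf;
	int i, fd;

	fd = open(dev, O_RDONLY | O_DIRECT);
	if (fd < 0) { perror("open"); return 1; }
	devsz = lseek(fd, 0, SEEK_END);	/* device size in bytes */

	/* O_DIRECT requires an aligned buffer */
	if (posix_memalign(&buf, 4096, BLKSZ)) return 1;

	srand48(getpid());
	gettimeofday(&t0, NULL);
	for (i = 0; i < NREQS; i++) {
		/* random block-aligned offset within the device */
		off = (off_t)(drand48() * (devsz / BLKSZ)) * BLKSZ;
		if (pread(fd, buf, BLKSZ, off) != BLKSZ) {
			perror("pread");
			return 1;
		}
	}
	gettimeofday(&t1, NULL);

	sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%d reads in %.2fs, %.2f ms avg\n",
	       NREQS, sec, sec * 1000.0 / NREQS);
	close(fd);
	return 0;
}

(A concurrent r+w variant would be the same idea with several such
processes running in parallel, some reading and some writing.)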

This is the only machine here still running 2.6.27; all the rest
have been upgraded to 2.6.32, and I see good .32 performance there.
But this is also the only machine with a hardware raid controller,
which is onboard and hence not easy to get rid of, so I'm sorta forced
to use it (I prefer a software raid solution for numerous reasons).

One possible cause of this that comes to mind is block device write
barriers.  But I can't find out when they actually got implemented.

The most problematic issue here is that only one machine behaves
like this, and it is a production server, so I have very little
chance to experiment with it.

So before the next try, I'd love to have some suggestions about what
to look for.  In particular, I think it's worth the effort to look
at write barriers, but again, I don't know how to check whether
they're actually being used.
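
Two things I'm considering, if someone can confirm they make sense:
grepping dmesg for "barrier" (if I remember right, XFS logs a
"Disabling barriers, ..." line at mount time when the device refuses
cache flushes, so the absence of such a line would suggest barriers
are in use), and, during a maintenance window, an A/B test with
"mount -o remount,nobarrier" (assuming the controller cache is
battery-backed, which I still need to verify).  If .32 with nobarrier
performs like .27 did, that would point rather strongly at
barriers/cache flushes.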

Does anyone have suggestions on what to collect and what to look at?

Thank you!

/mjt
