Message-ID: <0F5B06BAB751E047AB5C87D1F77A778869941207A5@GVW0547EXC.americas.hpqcorp.net>
Date: Mon, 7 Dec 2009 16:32:36 +0000
From: "Miller, Mike (OS Dev)" <Mike.Miller@...com>
To: Ozan Çağlayan <ozan@...dus.org.tr>,
linux-kernel <linux-kernel@...r.kernel.org>
CC: "scameron@...rdog.cce.hp.com" <scameron@...rdog.cce.hp.com>,
"jens.axboe@...cle.com" <jens.axboe@...cle.com>
Subject: RE: CCISS performance drop in buffered disk reads in newer kernels
> -----Original Message-----
> From: Ozan Çağlayan [mailto:ozan@...dus.org.tr]
> Sent: Monday, December 07, 2009 4:46 AM
> To: linux-kernel
> Cc: scameron@...rdog.cce.hp.com; Miller, Mike (OS Dev);
> jens.axboe@...cle.com
> Subject: CCISS performance drop in buffered disk reads in
> newer kernels
>
> Hi,
>
> We have 2 HP Proliant DL380G5 server running with different kernels.
>
> I was inspecting a basic kernel-compile time. On the one with
> 2.6.25.20 kernel, the compilation took ~1.5 minutes. On the
> one with 2.6.30.9 kernel, it took ~6 minutes. Both systems
> are using ccache as a build helper.
>
> Then I ran hdparm on both systems, the results are below.
>
> I'd like to help debugging this issue through bisect or
> another method but since there are more parameters that
> differ from one to the other server than only the kernel
> version, I'm a little bit stuck.
>
> Thanks,
> Ozan
>
Ozan,
I'm aware of the performance drop. Please see: http://bugzilla.kernel.org/show_bug.cgi?id=13127. I removed the huge read-ahead value of 1024 that we used because users were complaining about small writes being starved. That was back around the 2.6.25 timeframe. Since then there have been no changes in the main I/O path. I'll get back to this as time allows.
Meanwhile, you can tweak some of the block layer tunables, for example:
echo 64 > /sys/block/cciss\!c0d1/queue/read_ahead_kb
OR
blockdev --setra 128 /dev/cciss/c0d1
These are just example values. There are also max_hw_sectors_kb and max_sectors_kb that can be adjusted.
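For reference, here is a minimal sketch for checking the current values before changing anything. It assumes a POSIX shell and the sysfs layout used in the commands above. The names show_queue_tunables, sectors_to_kb and QUEUE_DIR are made up for this illustration. Note that blockdev --setra counts 512-byte sectors while read_ahead_kb is in kilobytes, so the two example commands above both request 64 KB of read-ahead.

```shell
#!/bin/sh
# Sketch only: print the block-layer tunables discussed above for one device.
# The function names and QUEUE_DIR are illustrative, not part of any tool.

# Convert a read-ahead value in 512-byte sectors (blockdev --setra units)
# to kilobytes (read_ahead_kb units).
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

# Print name=value for each tunable found under the given queue directory.
# Files that are missing or unreadable are reported as "unavailable".
show_queue_tunables() {
    queue_dir=$1
    for name in read_ahead_kb max_sectors_kb max_hw_sectors_kb; do
        if [ -r "$queue_dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$queue_dir/$name")"
        else
            printf '%s=unavailable\n' "$name"
        fi
    done
}

# Default to the cciss device used in the examples above; override with
# QUEUE_DIR=/sys/block/<device>/queue for another device.
show_queue_tunables "${QUEUE_DIR:-/sys/block/cciss!c0d1/queue}"
```

As far as I know, max_hw_sectors_kb is a read-only limit reported by the driver, and max_sectors_kb can be written with any value up to that limit, so it is worth reading both before writing either.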
-- mikem
>
> ### 2.6.30.9 (Slow one, compiled with PAE support, FS is ext4) ###
>
> # sync; sleep 2; echo 3 > /proc/sys/vm/drop_caches; hdparm -tT -vvvv
> /dev/cciss/c0d0p5
>
> /dev/cciss/c0d0p5:
> HDIO_DRIVE_CMD(identify) failed: Invalid exchange
> readonly = 0 (off)
> readahead = 256 (on)
> geometry = 245410/255/32, sectors = 2002550382, start = 4225158
> Timing cached reads: 12038 MB in 2.00 seconds = 6027.00 MB/sec
> Timing buffered disk reads: 184 MB in 3.00 seconds = 61.31 MB/sec <------ Note the drop here!
>
> # dmesg | grep cciss
> [ 0.000000] Kernel command line: root=LABEL=PARDUS_ROOT vga=791
> splash=silent quiet resume=/dev/cciss/c0d0p1
> [ 6.023542] cciss 0000:18:08.0: PCI INT A -> GSI 19 (level, low) ->
> IRQ 19
> [ 6.023566] cciss: MSI init failed
> [ 6.053008] IRQ 19/cciss0: IRQF_DISABLED is not guaranteed
> on shared IRQs
> [ 6.053015] cciss0: <0x3238> at PCI 0000:18:08.0 IRQ 19 using DAC
> [ 6.053918] cciss/c0d0: p1 p2 < p5 >
> [ 6.320852] kjournald2 starting: pid 190, dev
> cciss!c0d0p5:8, commit
> interval 5 seconds
> [ 6.322344] EXT4-fs: mounted filesystem cciss!c0d0p5 with ordered
> data mode
> [ 10.994505] EXT4 FS on cciss!c0d0p5, internal journal on
> cciss!c0d0p5:8
> [ 11.783302] Adding 2112508k swap on /dev/cciss/c0d0p1. Priority:-1
> extents:1 across:2112508k
> [ 16.696090] JBD: barrier-based sync failed on cciss!c0d0p5:8 -
> disabling barriers
>
>
> ### 2.6.25.20 (Fast one, no PAE support, FS is ext3) ###
>
> # sync;sleep 2; echo 3 > /proc/sys/vm/drop_caches; hdparm -tT -vvv
> /dev/cciss/c0d0p5
>
> /dev/cciss/c0d0p5:
> readonly = 0 (off)
> readahead = 256 (on)
> geometry = 245426/255/32, sectors = 2002678902, start = 4096638
> Timing cached reads: 10650 MB in 2.00 seconds = 5334.38 MB/sec
> Timing buffered disk reads: 420 MB in 3.01 seconds = 139.72 MB/sec
>
> # dmesg | grep cciss
> Kernel command line: root=LABEL=PARDUS_ROOT vga=791
> splash=silent quiet
> resume=/dev/cciss/c0d0p1
> cciss0: <0x3238> at PCI 0000:18:08.0 IRQ 212 using DAC
> cciss/c0d0: p1 p2 < p5 >
> EXT3 FS on cciss/c0d0p5, internal journal
> Adding 2048248k swap on /dev/cciss/c0d0p1. Priority:-1 extents:1 across:2048248k
>
>