Message-ID: <20110720082640.GZ7561@ics.muni.cz>
Date: Wed, 20 Jul 2011 10:26:40 +0200
From: Lukas Hejtmanek <xhejtman@....muni.cz>
To: k-ueda@...jp.nec.com
Cc: agk@...hat.com, linux-kernel@...r.kernel.org
Subject: request based device mapper in Linux
Hi,
I am encountering serious problems with your commit
cec47e3d4a861e1d942b3a580d0bbef2700d2bb2, which introduced the request-based
device mapper in Linux.
I have a machine with 80 SATA disks connected to two LSI SAS 2.0 controllers
(mpt2sas driver).
All disks are configured as multipath devices in failover mode:
defaults {
    udev_dir /dev
    polling_interval 10
    selector "round-robin 0"
    path_grouping_policy failover
    path_checker directio
    rr_min_io 100
    no_path_retry queue
    user_friendly_names no
}
If I run the following command, ksoftirqd eats 100% CPU as soon as all
available memory is used for buffers:
for i in `seq 0 79`; do dd if=/dev/dm-$i of=/dev/null bs=1M count=10000 & done
top looks like this:
Mem: 48390M total, 45741M used, 2649M free, 43243M buffers
Swap: 0M total, 0M used, 0M free, 1496M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12 root 20 0 0 0 0 R 96 0.0 0:38.78 ksoftirqd/4
17263 root 20 0 9432 1752 616 R 14 0.0 0:03.19 dd
17275 root 20 0 9432 1756 616 D 14 0.0 0:03.16 dd
17271 root 20 0 9432 1756 616 D 10 0.0 0:02.60 dd
17258 root 20 0 9432 1756 616 D 7 0.0 0:02.67 dd
17260 root 20 0 9432 1756 616 D 7 0.0 0:02.47 dd
17262 root 20 0 9432 1752 616 D 7 0.0 0:02.38 dd
17264 root 20 0 9432 1756 616 D 7 0.0 0:02.42 dd
17267 root 20 0 9432 1756 616 D 7 0.0 0:02.35 dd
17268 root 20 0 9432 1756 616 D 7 0.0 0:02.45 dd
17274 root 20 0 9432 1756 616 D 7 0.0 0:02.47 dd
17277 root 20 0 9432 1756 616 D 7 0.0 0:02.53 dd
17261 root 20 0 9432 1756 616 D 7 0.0 0:02.36 dd
17265 root 20 0 9432 1756 616 R 7 0.0 0:02.47 dd
17266 root 20 0 9432 1756 616 R 7 0.0 0:02.44 dd
17269 root 20 0 9432 1756 616 D 7 0.0 0:02.62 dd
17270 root 20 0 9432 1756 616 D 7 0.0 0:02.46 dd
17272 root 20 0 9432 1756 616 D 7 0.0 0:02.36 dd
17273 root 20 0 9432 1756 616 D 7 0.0 0:02.46 dd
17276 root 20 0 9432 1752 616 D 7 0.0 0:02.36 dd
17278 root 20 0 9432 1752 616 D 7 0.0 0:02.44 dd
17259 root 20 0 9432 1752 616 D 6 0.0 0:02.37 dd
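
To check that the softirq time really comes from block I/O completions, the
BLOCK row of /proc/softirqs can be sampled. This is only a minimal sketch: it
assumes /proc/softirqs is available on the running kernel and that its columns
map to CPU0, CPU1, ... in order. block_softirqs is a hypothetical helper name,
not an existing tool.

```shell
# Print the per-CPU BLOCK softirq counters, one "cpuN count" pair per line.
# $1: file to parse (defaults to /proc/softirqs; the parameter only exists
#     so that a saved copy of the file can be inspected as well).
# Assumption: the Nth counter column belongs to CPU N-1.
block_softirqs() {
    awk '$1 == "BLOCK:" {
             for (i = 2; i <= NF; i++)
                 printf "cpu%d %s\n", i - 2, $i
         }' "${1:-/proc/softirqs}"
}
```

Running block_softirqs twice a few seconds apart and comparing the counters
should show which CPU handles most completions; if the guess above is right,
it would be the one whose ksoftirqd thread is at the top of the list.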
It looks like device mapper produces long SG lists and end_clone_bio() has
something like quadratic complexity.
The problem can be worked around by shortening the SG lists with:
for i in /sys/block/dm-*; do echo 128 > $i/queue/max_sectors_kb; done
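
Since the loop above overwrites the previous limits, a variant that records
them first may be handy for going back to the original state. This is a
minimal sketch, not a tested tool: the function names and the optional sysfs
root argument are hypothetical and only there to allow a dry run against a
copy of the tree.

```shell
# Cap max_sectors_kb on every dm-* queue and print the previous values
# as "path old_value" lines so that they can be restored later.
# $1: new limit in KiB
# $2: sysfs root (defaults to /sys)
cap_max_sectors_kb() {
    limit=$1
    root=${2:-/sys}
    for f in "$root"/block/dm-*/queue/max_sectors_kb; do
        [ -w "$f" ] || continue      # also skips an unexpanded glob
        old=$(cat "$f")
        printf '%s %s\n' "$f" "$old"
        echo "$limit" > "$f"
    done
}

# Restore the values printed by cap_max_sectors_kb (read from stdin).
restore_max_sectors_kb() {
    while read -r f old; do
        echo "$old" > "$f"
    done
}
```

Usage would be along the lines of "cap_max_sectors_kb 128 > /tmp/saved" to
apply the workaround and "restore_max_sectors_kb < /tmp/saved" to undo it.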
I am using the SLES 2.6.32.36-0.5-default kernel.
Using iostat -x, I can see about 25000 rrqm/s but only 180 r/s, so it looks
like each request is merged from more than 100 bios, which causes serious
trouble for the ksoftirqd callbacks.
Without the workaround, all dd readers together reach only 600MB/s.
With the workaround, they reach about 2.8GB/s.
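
For reference, the two ratios behind the numbers above can be computed as
follows. This is just the arithmetic on the reported figures; ratio is a
hypothetical helper, and 2.8GB/s is assumed to mean 2800MB/s.

```shell
# Print a / b rounded to p decimal places.
# $1: numerator, $2: denominator, $3: number of decimals
ratio() {
    awk -v a="$1" -v b="$2" -v p="$3" \
        'BEGIN { printf "%." p "f\n", a / b }'
}

# Merged reads per issued read request: rrqm/s divided by r/s.
ratio 25000 180 0     # prints 139
# Aggregate throughput gain from the workaround, in MB/s.
ratio 2800 600 1      # prints 4.7
```

So each issued read carries roughly 139 merges, and the workaround gives
about a 4.7x gain in aggregate throughput.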
--
Lukáš Hejtmánek