Message-ID: <CALS80vgNzNENo=mDSt2b0363UXdgnQ6GtyActdYNs=oZgunpOg@mail.gmail.com>
Date: Thu, 18 Jun 2015 09:35:35 +0200
From: Erik Cumps <erik.cumps@...turnus.com>
To: linux-kernel@...r.kernel.org
Cc: Erik Cumps <erik.cumps@...turnus.com>
Subject: Re: Unexpected slow block device write IO performance compared to
uncached, unsynced direct IO using stock kernels
On Tue, Jun 16, 2015 at 3:56 PM, Erik Cumps <erik.cumps@...turnus.com> wrote:
>
> We are noticing some strange block device IO performance and our
> investigations are leading us away from the hardware and towards the
> kernel. This could be a simple tuning problem or a known issue, so
> before taking a deep dive into the kernel sources and Debian kernel
> patches we would like to rule out the simple things first.
>
> The context is a 16 GB 32-bit Intel Debian workstation, using an ext4
> filesystem with journalling, on an LVM SATA3 SSD disk, with relatively
> recent stock kernels from 3.2 up to 4.0, running some KVM virtual
> machines. The host system (so not the virtual machines) sporadically
> shows extremely slow write performance (around 4 megabytes per
> second). However, with the Debian 3.2.0 kernel this problem does not
> manifest itself.
>
> We've created a simple IO performance test script to investigate this.
> It is basically a smart wrapper around dd, copying data between the
> block device under test and a ramdisk filesystem, using flags to
> select cached, synced or direct IO, clearing the caches before the
> test, and running the write and read tests three times to account for
> transient effects.
>
> The results of these tests are unexpected: we see the expected normal
> write performance when using uncached, unsynced, direct IO and very slow
> write performance using the regular cached IO. The difference is huge:
> it is sometimes two orders of magnitude!

Actually, it is the *synchronous* direct IO that matches the expected
raw write performance of the device.

The "regular IO" test is doing roughly this:

  echo 3 > /proc/sys/vm/drop_caches
  dd if=ramdisk_file of=test_file bs=1M count=100
  dd if=ramdisk_file of=test_file bs=1M count=100
  dd if=ramdisk_file of=test_file bs=1M count=100

The direct IO test is doing roughly this:

  echo 3 > /proc/sys/vm/drop_caches
  dd if=ramdisk_file of=test_file oflag=sync,direct bs=1M count=100
  dd if=ramdisk_file of=test_file oflag=sync,direct bs=1M count=100
  dd if=ramdisk_file of=test_file oflag=sync,direct bs=1M count=100
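
For completeness, the setup around those runs looks roughly like the
sketch below. The mount point, file names and sizes are illustrative
placeholders rather than the exact values from the script, and the
explicit sync before dropping the caches follows the usual
recommendation rather than being a claim about what the script
currently does:

  # Illustrative setup only; paths and sizes are placeholders.
  mkdir -p /mnt/ramdisk
  mount -t tmpfs -o size=256M tmpfs /mnt/ramdisk
  dd if=/dev/urandom of=/mnt/ramdisk/ramdisk_file bs=1M count=100

  # Flush dirty data, then drop the page cache before each test run.
  sync
  echo 3 > /proc/sys/vm/drop_caches
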
> This seems to rule out the block device and IO controller being at
> fault. In fact, other tests showed the same performance discrepancy
> with an NFS-mounted filesystem and a spinning platter disk.
>
> We also noticed that, when the performance is slow, shutting down the
> KVM virtual machines returns the performance to normal.
>
> Maybe there is something going wrong with cache/buffer handling?
> Thanks for your insights.
>
> I've kept this mail intentionally free from too many technical details
> but I'll be happy to provide additional relevant info as required.
Regards,
Erik Cumps