Message-ID: <x49eiohqa06.fsf@segfault.boston.devel.redhat.com>
Date: Mon, 02 Nov 2009 09:25:29 -0500
From: Jeff Moyer <jmoyer@...hat.com>
To: Zubin Dittia <zubin@...tri.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: SSD read latency negatively impacted by large writes (independent of choice of I/O scheduler)

Zubin Dittia <zubin@...tri.com> writes:
> I've been doing some testing with an Intel X25-E SSD, and noticed that
> large writes can severely affect read latency, regardless of which I/O
> scheduler or scheduler parameters are in use (this is with kernel
> 2.6.28-16 from Ubuntu jaunty 9.04). The test was very simple: I had
> two threads running; the first was in a tight loop reading different
> 4KB-sized blocks (and recording the latency of each read) from the SSD
> block device file. While the first thread is doing this, a second
> thread does a single big 5MB write to the device. What I noticed is
> that about 30 seconds after the write (which is when it is actually
> written back to the device from the buffer cache), I see a very large
> spike in read latency: from 200 microseconds to 25 milliseconds.
> This seems to imply that the writes issued by the scheduler are not
> being broken up into sufficiently small chunks with interspersed
> reads; instead, the whole sequential write seems to be getting issued
> while starving reads during that period. I've noticed the same
> behavior with SSDs from another vendor as well, and there the latency
> impact was even worse (80 ms). Playing around with different I/O
> schedulers and parameters doesn't seem to help at all.
>
> The same behavior is exhibited when using O_DIRECT as well (except
> that the latency hit is immediate instead of 30 seconds later, as one
> would expect). The only way I was able to reduce the worst-case read
> latency was by using O_DIRECT and breaking up the large write into
> multiple smaller writes (with one system call per smaller write). My
> theory is that the time between write system calls was enough to allow
> reads to squeeze themselves in between the writes. But, as would be
> expected, this hurts sequential write throughput because of the
> overhead of the additional system calls.
>
> My question is: have others seen this behavior? Are there any
> tunables that could help (perhaps a parameter that would dictate the
> largest write that can be pending to the device at any given time)?
> If not, would it make sense to implement a new I/O scheduler (or hack
> an existing one) that does this?
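
For anyone who wants to reproduce this, here is a minimal sketch (not
the original test program) of the reader side: a tight loop of 4KB reads
at random offsets, reporting any read slower than a threshold.  The
device path, device size and 1 ms threshold below are placeholders, and
O_DIRECT is used so that every read is guaranteed to hit the device;
build with -lrt on older glibc for clock_gettime.

/*
 * Hypothetical read-latency probe: 4KB O_DIRECT reads at random
 * offsets, printing any read that takes longer than 1 ms.  Device
 * path and size are placeholders.
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define BLK		4096
#define DEV		"/dev/sdb"		/* placeholder device */
#define DEV_BLOCKS	(8ULL * 1024 * 1024)	/* assumed 32GB device, in 4KB blocks */

int main(void)
{
	void *buf;
	int fd = open(DEV, O_RDONLY | O_DIRECT);

	if (fd < 0 || posix_memalign(&buf, BLK, BLK))
		return 1;

	srandom((unsigned)time(NULL));
	for (;;) {
		struct timespec t0, t1;
		off_t off = (off_t)(random() % DEV_BLOCKS) * BLK;
		long us;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (pread(fd, buf, BLK, off) != BLK)
			break;
		clock_gettime(CLOCK_MONOTONIC, &t1);

		us = (t1.tv_sec - t0.tv_sec) * 1000000L +
		     (t1.tv_nsec - t0.tv_nsec) / 1000L;
		if (us > 1000)		/* report reads slower than 1 ms */
			printf("read at %lld took %ld us\n",
			       (long long)off, us);
	}
	close(fd);
	free(buf);
	return 0;
}

The writer side can simply be a single 5MB write() to the same device
from a second process: buffered, the latency spike shows up at writeback
time; with O_DIRECT it shows up immediately.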
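
And a sketch of the splitting workaround described above, assuming a
256KB chunk size; the chunk size is the knob here, since smaller chunks
give reads more chances to slip in between writes at the cost of more
system calls.  Note that it scribbles over the start of the device.

/*
 * Hypothetical chunked writer: issue the 5MB write as a series of
 * smaller O_DIRECT pwrite()s instead of one big buffered write.
 * Device path and chunk size are placeholders.  WARNING: this
 * overwrites the first 5MB of the device.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEV	"/dev/sdb"		/* placeholder device */
#define TOTAL	(5 * 1024 * 1024)	/* the 5MB write from the test */
#define CHUNK	(256 * 1024)		/* smaller -> lower read latency */

int main(void)
{
	void *buf;
	off_t off;
	int fd = open(DEV, O_WRONLY | O_DIRECT);

	if (fd < 0 || posix_memalign(&buf, 4096, CHUNK))
		return 1;
	memset(buf, 0xab, CHUNK);

	for (off = 0; off < TOTAL; off += CHUNK)
		if (pwrite(fd, buf, CHUNK, off) != CHUNK)
			return 1;

	close(fd);
	free(buf);
	return 0;
}
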
I haven't verified your findings, but if what you state is true, then
you could try tuning max_sectors_kb for your device. Making that
smaller will decrease the total amount of I/O that can be queued in the
device at any given time. There's always a trade-off between bandwidth
and latency, of course.
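
max_sectors_kb lives in sysfs, at /sys/block/<dev>/queue/max_sectors_kb,
and is normally just written from a shell.  Purely as an illustration, a
small helper that prints the current value and clamps it could look like
the sketch below; the device name and the 128KB value are only examples,
and whether 128KB is the right number for the X25-E is something you
would have to measure.

/*
 * Hypothetical helper: clamp the largest request the block layer will
 * send to the device by writing max_sectors_kb (value is in KB).
 * Device name and new value are examples only.
 */
#include <stdio.h>

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "sdb";	/* example device */
	const char *kb  = argc > 2 ? argv[2] : "128";	/* example limit, KB */
	char path[256], cur[32];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/max_sectors_kb", dev);
	f = fopen(path, "r+");
	if (!f) {
		perror(path);
		return 1;
	}
	if (fgets(cur, sizeof(cur), f))
		printf("%s was %s", path, cur);
	rewind(f);
	fprintf(f, "%s\n", kb);
	fclose(f);
	return 0;
}
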
Cheers,
Jeff