linux-kernel - Large write = large latency for small writes

Message-ID: <72dbd3150903171319u567fc267m36857506c024315d@mail.gmail.com>
Date:	Tue, 17 Mar 2009 13:19:18 -0700
From:	David Rees <drees76@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: Large write = large latency for small writes

There is a fairly active bug 12309 "Large I/O operations result in
slow performance and high iowait times" [1] which has gotten rather
large and unwieldy due to the number of participants and the confusion
caused by what appear to be multiple bugs in either the scheduler or
the IO request merging layers.

I encountered this bug when seeing some horrible latency on an NFS
client while the server was performing heavy IO [2] and initially
thought it was an NFS bug, but later found that NFS only makes it worse
because writes over NFS are synchronous by default.

I have a simple test case which demonstrates the huge increase in
write latency that occurs for small writes when a large disk
saturating write is also in progress [3]:

dd if=/dev/zero of=/tmp/bigfile bs=1M count=10000 conv=fdatasync &
sleep 10
time dd if=/dev/zero of=/tmp/smallfile bs=4k count=1 conv=fdatasync

On a handful of systems I have access to, it took anywhere from 6 to 45
seconds for the small write to complete.  Others in the bug have
reproduced this across a number of filesystems (ext3, reiserfs, ext4).
xfs in particular seems to handle this test case better than the
others, as do systems which can sustain high write speeds.
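
For repeated runs, the test case above can be wrapped in a small helper.
This is only a sketch, assuming bash plus GNU dd and GNU date; the
function name measure_small_write and its arguments are hypothetical and
not part of the original report. It starts the large write in the
background, waits, then times a single 4k fdatasync write and prints the
elapsed time in milliseconds.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report):
#   measure_small_write DIR BIG_MB WAIT_SECS
# DIR       directory on the filesystem under test
# BIG_MB    size of the large background write, in MiB
# WAIT_SECS how long to let the large write run before the small one
measure_small_write() {
    local dir=$1 big_mb=$2 wait_secs=$3
    local start_ns end_ns big_pid

    # Large, disk-saturating write in the background.
    dd if=/dev/zero of="$dir/bigfile" bs=1M count="$big_mb" \
        conv=fdatasync 2>/dev/null &
    big_pid=$!

    sleep "$wait_secs"

    # Time one small synchronous write (GNU date: %N is nanoseconds).
    start_ns=$(date +%s%N)
    dd if=/dev/zero of="$dir/smallfile" bs=4k count=1 \
        conv=fdatasync 2>/dev/null
    end_ns=$(date +%s%N)

    # Let the large write finish, then remove both test files.
    wait "$big_pid"
    rm -f "$dir/bigfile" "$dir/smallfile"

    # Small-write latency in whole milliseconds.
    echo $(( (end_ns - start_ns) / 1000000 ))
}
```

With the sizes from the test case, this would be called as
"measure_small_write /tmp 10000 10".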

The only way I've been able to reduce the latency to acceptable levels
is to drop vm.dirty_background_ratio and vm.dirty_ratio to 1 and 2
respectively - but even then on some systems the small write still
takes 5-7 seconds.
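
The two tunables live under /proc/sys/vm, so the change described above
can be sketched as follows. The function name set_dirty_ratios and its
optional third argument are hypothetical, added here only so the helper
can be pointed at a scratch directory; writing to the real /proc/sys/vm
requires root, and the change does not persist across reboots.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report):
#   set_dirty_ratios BACKGROUND_RATIO DIRTY_RATIO [VM_DIR]
# VM_DIR defaults to /proc/sys/vm; writing there needs root.
set_dirty_ratios() {
    local bg=$1 fg=$2 vm_dir=${3:-/proc/sys/vm}

    # Print the previous values so they can be restored later.
    echo "old: dirty_background_ratio=$(cat "$vm_dir/dirty_background_ratio") dirty_ratio=$(cat "$vm_dir/dirty_ratio")"

    # Apply the new values.
    echo "$bg" > "$vm_dir/dirty_background_ratio"
    echo "$fg" > "$vm_dir/dirty_ratio"

    echo "new: dirty_background_ratio=$(cat "$vm_dir/dirty_background_ratio") dirty_ratio=$(cat "$vm_dir/dirty_ratio")"
}
```

Run as root, "set_dirty_ratios 1 2" applies the values mentioned above.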

The systems I've been testing on are all Fedora 9 or 10 systems with
the latest kernels (basically 2.6.27.19), ext3 filesystems and a
variety of CPU, memory and disk configurations (but nothing too new or
fast).

Any ideas?  How much latency is acceptable with the test case?

-Dave

[1] http://bugzilla.kernel.org/show_bug.cgi?id=12309
[2] http://marc.info/?l=linux-nfs&m=123697692631683&w=2
[3] http://bugzilla.kernel.org/show_bug.cgi?id=12309#c249