lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Fri, 16 Oct 2015 10:58:01 -0600
From:	Andreas Dilger <adilger@...ger.ca>
To:	Jean-Paul Smeets <jean-paul.smeets@...l.com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: High write latency spikes when writing to EXT4

On Oct 16, 2015, at 2:07 AM, Jean-Paul Smeets <jean-paul.smeets@...l.com> wrote:
> In our application we are seeing occasional high latency spikes (500ms range
> )when writing small amounts of data (a couple MB, a couple of 100 IOs) to
> files on an EXT4 volume (data=ordered). We are running kernel 3.10.62.
> 
> What seems to be happening is that a journal commit takes a long time.
> During that time no write access to the journal is possible.
> So a subsequent vfs.write is blocked until the journal commit is finished.
> 
> We expected the page cache to decouple the vfs.write from this and not to
> block the vfs.write.
> This behavior causes large spikes in on average much lower write latencies.
> 
> Small log excerpt (duration in us):
> 16:18:56.966387, PROCA, 20234, __ext4_journal_get_write_access: duration: 140412
> 16:18:56.966406, PROCB, 19965, __ext4_journal_get_write_access: duration: 461892
> 16:18:56.966422, PROCC, 20387, __ext4_journal_get_write_access: duration: 391105
> 16:18:56.966439, PROCD, 20041, __ext4_journal_get_write_access: duration: 64256
> 16:18:56.966454, itheex, 19823, __ext4_journal_get_write_access: duration:
> 383671
> 16:18:56.966718, jbd2/dm-8-8, 684, jbd2_journal_commit_transaction:
> duration: 468959
> 16:18:56.967426, PROCE, 19867, ext4_file_write: duration: 469431
> 16:18:56.967442, PROCE, 19867, vfs.write: duration: 469452
> 
> A similar, but less frequent issue is with the __ext4_journal_start_sb.
> Small log excerpt:
> 16:17:43.337004, PROCF, 20041, __ext4_journal_start_sb: duration: 240753
> 16:17:43.337024, PROCG, 19980, __ext4_journal_start_sb: duration: 269051
> 16:17:43.337498, PROCG, 19980, ext4_file_write: duration: 269531
> 16:17:43.337515, PROCG, 19980, vfs.write: duration: 269554
> 
> Can anything be done about these very high latency spikes ?

There are a number of options to disconnect the file write operations from the journal commits:
- preallocate file blocks with fallocate() so that there are no allocations
  during the write
- use data=writeback so that writes are not ordered with journal commits

It is also possible to store the journal on a separate low-latency device
like SSD or PCI NVRAM so that the journal commits are very fast and do not
cause latency spikes during writes.

Cheers, Andreas






Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists