Message-Id: <3FF04DCD-7CE4-486A-92F5-2337BC64AE50@dilger.ca>
Date:	Tue, 8 May 2012 11:02:19 -0600
From:	Andreas Dilger <adilger@...ger.ca>
To:	Daniel Pocock <daniel@...ock.com.au>
Cc:	Martin Steigerwald <ms@...mix.de>,
	Martin Steigerwald <Martin@...htvoll.de>,
	linux-ext4@...r.kernel.org
Subject: Re: ext4, barrier, md/RAID1 and write cache

On 2012-05-08, at 9:28 AM, Daniel Pocock wrote:
> My impression is that the faster performance of the USB disk was a red
> herring, and the problem really is just the nature of the NFS protocol
> and the way it is stricter about server-side caching (when sync is
> enabled) and consequently it needs more iops.
> 
> I've turned two more machines (a HP Z800 with SATA disk and a Lenovo
> X220 with SSD disk) into NFSv3 servers, repeated the same tests, and
> found similar performance on the Z800, but 20x faster on the SSD (which
> can support more IOPS)

Another possible option is to try "-o data=journal" for the ext4
filesystem.  This will, in theory, turn your random IO workload to
the filesystem into a streaming IO workload to the journal.  This
is only useful if the filesystem is not continually busy, and needs
a large enough journal (and enough RAM to match) to handle the burst
IO loads.
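As a rough, untested sketch (assuming the exported filesystem lives on /dev/md0
and is mounted at /srv/nfs -- substitute your own device and mountpoint), the
journal could be enlarged and the filesystem mounted with full data journalling
roughly like this:

    # unmount, then replace the existing journal with a larger one (size in MB)
    umount /srv/nfs
    tune2fs -O ^has_journal /dev/md0
    tune2fs -J size=4096 /dev/md0
    # mount with data journalling (or add data=journal to the /etc/fstab entry)
    mount -o data=journal /dev/md0 /srv/nfs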

For example, if you are writing 1GB of data you need a 4GB journal
size and 4GB of RAM to allow all of the data to burst into the journal
and then be written into the filesystem asynchronously.  It would also be
interesting to see if there is a benefit from running with an external
journal (possibly on a separate disk or an SSD), because then the
synchronous part of the IO does not seek, and then the small IOs can
be safely written to the filesystem asynchronously (they will be
rewritten from the journal if the server crashes).
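As a sketch only (assuming a spare SSD partition /dev/sdc1 for the journal and
the data filesystem on /dev/md0 -- both names are placeholders), an external
journal could be set up along these lines:

    # format the SSD partition as a journal device (block size must match the fs)
    mke2fs -O journal_dev -b 4096 /dev/sdc1
    # point the (unmounted) data filesystem at the external journal
    tune2fs -O ^has_journal /dev/md0
    tune2fs -J device=/dev/sdc1 /dev/md0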

Typically, data=journal mode will cut I/O performance roughly in half,
since all data is written twice, but in your case NFS is hurting
performance far more than that, so the extra "overhead" may still
result in better performance as seen by the clients.

>>> All the iostat output is typically like this:
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> dm-23             0.00     0.00    0.20  187.60     0.00     0.81     8.89     2.02   10.79   5.07  95.20
>>> dm-23             0.00     0.00    0.20  189.80     0.00     0.91     9.84     1.95   10.29   4.97  94.48
>>> dm-23             0.00     0.00    0.20  228.60     0.00     1.00     8.92     1.97    8.58   4.10  93.92
>>> dm-23             0.00     0.00    0.20  231.80     0.00     0.98     8.70     1.96    8.49   4.06  94.16
>>> dm-23             0.00     0.00    0.20  229.20     0.00     0.94     8.40     1.92    8.39   4.10  94.08
>> 
>> Hmmm, the disk looks quite utilized. Are there other I/O workloads on the
>> machine?
> 
> No, just me testing it

Looking at these results, the average IO size is very small: at around
210 writes/second and roughly 1MB/s of write bandwidth, the average
write size is only about 4.5kB.

Cheers, Andreas




