linux-ext4 - Re: Append and fsync performance in ext4 DAX

Message-ID: <20180429012007.GN5965@thunk.org>
Date:   Sat, 28 Apr 2018 21:20:08 -0400
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Vijay Chidambaram <vvijay03@...il.com>
Cc:     Ext4 <linux-ext4@...r.kernel.org>,
        Rohan Kadekodi <kadekodirohan@...il.com>,
        aasheesh kolli <aasheesh.kolli@...il.com>
Subject: Re: Append and fsync performance in ext4 DAX

On Sat, Apr 28, 2018 at 11:24:32AM -0500, Vijay Chidambaram wrote:
> 
> While we expect workload 1 to take more time than workload 2 since it
> is extending the file, 10x higher time seems suspicious. If we remove
> the fsync in workload 1, the running time drops to 3s. If we remove
> the fsync in workload 2, the running time is around the same (1.5s).

Can you mount the file system, run workload #N, and then, once it's
done, capture the output of /proc/fs/jbd2/<dev>-8/info, which should
look like this:

% cat /proc/fs/jbd2/dm-1-8/info 
498438 transactions (498366 requested), each up to 65536 blocks
average: 
  0ms waiting for transaction
  0ms request delay
  470ms running transaction
  0ms transaction was being locked
  0ms flushing data (in ordered mode)
  0ms logging transaction
  2522us average transaction commit time
  161 handles per transaction
  14 blocks per transaction
  15 logged blocks per transaction

It would be interesting to see this for workload #1 and workload #2.
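
To compare the two captures side by side, the statistics can be pulled
into a dictionary. The following is a minimal sketch, assuming the info
file has exactly the layout shown above; parse_jbd2_info is a
hypothetical helper name, and other kernel versions may print
additional or differently worded lines.

```python
import re
import sys

# Header line: "<N> transactions (<M> requested), each up to <K> blocks"
HEADER = re.compile(r"(\d+) transactions \((\d+) requested\), each up to (\d+) blocks")
# Statistic line: "<value>[ms|us] <description>"
STAT = re.compile(r"^\s*(\d+)(ms|us)?\s+(.+?)\s*$")


def parse_jbd2_info(text):
    """Parse jbd2 info text into a dict (hypothetical helper).

    Header counters are stored as ints; every other statistic is stored
    under its description as a (value, unit) tuple, where unit is
    "ms", "us", or "" for plain counts.
    """
    stats = {}
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            stats["transactions"] = int(header.group(1))
            stats["requested"] = int(header.group(2))
            stats["max_blocks"] = int(header.group(3))
            continue
        stat = STAT.match(line)
        if stat:
            value, unit, label = stat.groups()
            stats[label] = (int(value), unit or "")
    return stats


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the info file for your own device as the argument.
    with open(sys.argv[1]) as f:
        for key, value in parse_jbd2_info(f.read()).items():
            print(key, value)
```

Running this once after each workload and comparing "transactions" and
"average transaction commit time" should show whether the two workloads
differ in the number of journal commits or in the cost of each one.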

I will note that if you were using fdatasync(2) instead of fsync(2)
for workload #2, there wouldn't be any journal transactions needed by
the overwrites, and the speedup would be quite expected.

It might be that in the overwrite case there simply isn't a need to do
many journal transactions, especially if you are using 128-byte inodes,
where the mtime timestamp has only one-second granularity.

So you might want to try a workload #3, where the fsync(2) is replaced
by fdatasync(2), and measure the wall clock time and capture the jbd2
info output as well.
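
A rough sketch of such a comparison follows. It is not the benchmark
from the original report: run_workload, the write size, and the write
count are all assumptions chosen for illustration, and the file path
must point at a file on the ext4 DAX mount under test.

```python
import os
import sys
import time


def run_workload(path, sync, append, count=1000, size=4096):
    """Time `count` writes of `size` bytes, calling sync(fd) after each.

    append=True extends an empty file; append=False overwrites blocks
    that were written (and synced) before the timer starts.
    Returns the elapsed wall-clock time in seconds.
    """
    buf = b"a" * size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, 0)
        if not append:
            # Fill the file first so the timed writes are pure overwrites.
            for i in range(count):
                os.pwrite(fd, buf, i * size)
            os.fsync(fd)
        start = time.monotonic()
        for i in range(count):
            os.pwrite(fd, buf, i * size)
            sync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    print("#1 append    + fsync    :", run_workload(path, os.fsync, True))
    print("#2 overwrite + fsync    :", run_workload(path, os.fsync, False))
    print("#3 overwrite + fdatasync:", run_workload(path, os.fdatasync, False))
```

To get separate jbd2 statistics for each workload, each one would need
to be run on its own after a fresh mount, rather than back to back as
in the usage lines above.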

Cheers,

					- Ted