linux-ext4 - Re: improved performance in case of data journaling

Message-ID: <CAKJOkCrBMhLKZjp4=1KJv3uY+xFBN0KEjDx_ix=88xr0oegD+w@mail.gmail.com>
Date:   Thu, 3 Dec 2020 01:07:51 -0800
From:   lokesh jaliminche <lokesh.jaliminche@...il.com>
To:     Martin Steigerwald <martin@...htvoll.de>
Cc:     Ext4 <linux-ext4@...r.kernel.org>,
        Andrew Morton <akpm@...uxfoundation.org>
Subject: Re: improved performance in case of data journaling

Hi Martin,

Thanks for the quick response.

Apologies from my side; I should have posted my fio job description
along with the fio logs.
Anyway, here is my fio workload:

[global]
filename=/mnt/ext4/test
direct=1
runtime=30s
time_based
size=100G
group_reporting

[writer]
new_group
rate_iops=250000
bs=4k
iodepth=1
ioengine=sync
rw=randwrite
numjobs=1

I am using an Intel Optane SSD, so it's certainly very fast.

I agree that delayed logging could help to hide the performance
degradation due to actual writes to the SSD. However, as per the iostat
output, data is definitely crossing the block layer, and since
data journaling logs both data and metadata, I am wondering why
or how IO requests see reduced latencies compared to metadata
journaling or even no journaling.
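
A toy model of the relogging idea may help frame the question. This is only a sketch under stated assumptions, not how jbd2 is actually implemented: `journal_block_writes` and `commit_every` are hypothetical names, the commit window is counted in updates rather than in time, and the block count assumes 100G / 4k = 26214400 blocks as in the job file above.

```python
import random


def journal_block_writes(block_ids, commit_every, relog=True):
    """Count journal block writes for an ordered stream of block updates.

    block_ids:    block numbers being modified, in order.
    commit_every: hypothetical number of updates per journal commit.
    relog:        if True, repeated updates to the same block within one
                  commit window are merged in memory and written once.
    """
    written = 0
    for start in range(0, len(block_ids), commit_every):
        window = block_ids[start:start + commit_every]
        written += len(set(window)) if relog else len(window)
    return written


rng = random.Random(0)
# Updates concentrated on a tiny hot set of 16 blocks.
hot = [rng.randrange(16) for _ in range(10_000)]
# Updates spread uniformly over a 100G file in 4k blocks (assumed layout).
wide = [rng.randrange(26_214_400) for _ in range(10_000)]

for name, stream in (("hot", hot), ("wide", wide)):
    print(name,
          journal_block_writes(stream, 1000, relog=False),
          journal_block_writes(stream, 1000, relog=True))
```

In this model, merging only pays off when the same blocks are hit repeatedly within one commit window. A uniformly random stream over a large file rarely repeats a block, so the merged and unmerged counts stay close together.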

Also, I am using direct IO mode, so ideally it should not be using any type
of caching. I am not sure whether that applies to journal writes, but the whole
point of journaling is to prevent data loss in case of abrupt failures, so
caching journal writes may result in data loss unless we are using NVRAM.

So the questions that come to my mind are:
1. Why do writes without journaling have longer latencies than
   write requests with metadata and data journaling?
2. Since metadata journaling has relatively fewer journal writes than data
   journaling, why are writes with data journaling faster than with no
   journaling or metadata journaling?
3. If there is an optimization that allows data journaling to be so fast without
   any risk of data loss, why is the same optimization not used in the case of
   metadata journaling?

On Thu, Dec 3, 2020 at 12:20 AM Martin Steigerwald <martin@...htvoll.de> wrote:
>
> lokesh jaliminche - 03.12.20, 08:28:49 CET:
> > I have been doing experiments to analyze the impact of data journaling
> > on IO latencies. Theoretically, data journaling should show long
> > latencies as compared to metadata journaling. However, I observed
> > that when I enable data journaling I see improved performance. Is
> > there any specific optimization for data journaling in the write
> > path?
>
> This has been discussed before as Andrew Morton found that data
> journalling would be surprisingly fast with interactive write workloads.
> I would need to look it up in my performance training slides or use
> internet search to find the reference to that discussion again.
>
> AFAIR even Andrew had no explanation for that. So I thought why would I
> have one? However an idea came to my mind: The journal is a sequential
> area on the disk. This could help with hard disks, I thought, at least
> if it does I/O mostly to the same, not too big location/file – as you did not
> post it, I don't know exactly what your fio job file is doing. However the
> latencies you posted as well as the device name certainly point to fast
> flash storage :).
>
> Another idea that just came to my mind is: AFAIK ext4 uses quite some
> delayed logging and relogging. That means if a block in the journal is
> changed another time within a certain time frame Ext4 changes it in
> memory before the journal block is written out to disk. Thus if the same
> block is overwritten again and again in a short time, at least some of the
> updates would only happen in RAM. That might help latencies even with
> NVMe flash as RAM usually still is faster.
>
> Of course I bet that Ext4 maintainers have a more accurate or detailed
> explanation than I do. But that was at least my idea about this.
>
> Best,
> --
> Martin
>
>