Message-ID: <YMzWE5sJeuIeOv1q@mit.edu>
Date: Fri, 18 Jun 2021 13:21:23 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Gabriel Krisman Bertazi <krisman@...labora.com>
Cc: kernel@...labora.com, linux-ext4@...r.kernel.org
Subject: Re: Potential regression with iomap DIO for 4k writes
On Tue, Jun 15, 2021 at 08:17:55PM -0400, Gabriel Krisman Bertazi wrote:
> Gabriel Krisman Bertazi <krisman@...labora.com> writes:
>
> > While exploring the performance of different DIO implementations,
> > I've come across what seems to be a noticeable regression (22%
> > slowdown) in 4k writes on Ext4 when comparing the original direct-io
> > with the current iomap dio implementation, as it exists on
> > linus/master. Perhaps you already know about this, but I'm having a
> > hard time understanding the root cause in order to attempt to
> > improve the situation.
>
> Sorry for the ping, but do you have any idea what we are seeing here?
Apologies for the delay in responding; somehow I missed your initial
e-mail on the subject on June 2nd, although I have since found it in
the mailing list archives[1]. I don't know if it got caught in a spam
trap or was accidentally deleted from my inbox.
[1] https://lore.kernel.org/linux-ext4/87lf7rkffv.fsf@collabora.com/
I didn't do any bs=4k benchmarks before we landed the DIO iomap
changes, and it's interesting that the regression largely goes away
with a 16k block size[2].
[2] https://people.collabora.com/~krisman/dio/week21/bench.png
Looking at your flame graphs[3][4]
[3] https://people.collabora.com/~krisman/dio/week23/clean_flames/5.4.0-dio_original-dio-ext4-write-4k.svg
[4] https://people.collabora.com/~krisman/dio/week23/clean_flames/5.5.0-dio_old-iomap-ext4-write-4k.svg
... nothing immediately jumps out at me.
Have you compared the output of /proc/lock_stat for the two kernels?
And is there anything obvious in the blktrace of the two kernels?
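Something along these lines should be enough for a first pass; this is
just a rough sketch, the nvme device path is only a placeholder for
your test disk, and lock_stat requires a kernel built with
CONFIG_LOCK_STAT=y:

    # lock contention statistics for one benchmark run
    echo 0 > /proc/lock_stat                # clear any stale counters
    echo 1 > /proc/sys/kernel/lock_stat     # start collecting
    # ... run the 4k write fio job here ...
    echo 0 > /proc/sys/kernel/lock_stat     # stop collecting
    head -n 100 /proc/lock_stat             # most contended locks first

    # block-layer trace of the same run
    blktrace -d /dev/nvme0n1 -o trace &
    # ... run the 4k write fio job here ...
    kill %1; wait
    blkparse -i trace | less

Comparing the most contended locks and the per-request latencies in the
blkparse output between the two kernels would be a reasonable next step.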
Cheers,
- Ted
> > * Benchmark
> >
> > For starters, I'm comparing three kernels, built with the same config and
> > compiler (gcc-8.4.0 (locally built)). My DUT is a Samsung SSD 970 EVO
> > Plus 250GB dedicated to this test (no concurrent IO).
> >
> > - Kernel 1: The commit immediately before iomap for ext4 was merged
> > ("f112a2fd1f59"). In the data below, this kernel is identified as
> > 5.4.0-original-dio. Available in a public branch at:
> >
> > <gitlab.collabora.com:krisman/linux.git -b dio/original-dio>
> >
> > - Kernel 2: tag 5.5 (first release with dio-iomap). In the data
> > below, identified as 5.5.0-old-iomap. For completeness, it is
> > available at:
> >
> > <gitlab.collabora.com:krisman/linux.git -b dio/old-dio>
> >
> > - Kernel 3: Kernel tag 5.13-rc3. In the data below, identified as
> > 5.13-rc3-iomap. For completeness, it is available at:
> >
> > <gitlab.collabora.com:krisman/linux.git -b dio/iomap>
> >
> > I ran the fio job below with the combinations: BS=4k,16k and RW=read,write
> >
> > fio --ioengine libaio --size=2G --direct=1 --iodepth=64 --time_based=1 \
> > --thread=1 --overwrite=1 --runtime=100 --output-format=terse
> >
> > For every kernel test, the file system was recreated, and the 2GB file
> > was pre-allocated.
> >
> > In an attempt to further isolate the problem, I tested both xfs and ext4
> > under the same conditions.
> >
> > The script I used is available at:
> >
> > <https://people.collabora.com/~krisman/dio/bench.sh>
> >
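As an aside for anyone trying to reproduce this: the fio command quoted
above omits the job name, the target file and the --bs/--rw options
that vary per run, which bench.sh presumably supplies. A rough
equivalent of one kernel's test loop, with the device, mount point and
file name purely as placeholders, would be:

    # one kernel's benchmark loop; the device, mount point and job name
    # are placeholders -- the authoritative parameters are in bench.sh
    mkfs.ext4 /dev/nvme0n1p1 && mount /dev/nvme0n1p1 /mnt/test
    fallocate -l 2G /mnt/test/testfile       # pre-allocate the 2GB file
    for RW in read write; do
        for BS in 4k 16k; do
            fio --name=dio-bench --filename=/mnt/test/testfile \
                --ioengine=libaio --size=2G --direct=1 --iodepth=64 \
                --time_based=1 --thread=1 --overwrite=1 --runtime=100 \
                --rw=$RW --bs=$BS --output-format=terse
        done
    done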
> > * Results
> >
> > I obtained the following performance results, relative to the baseline
> > 5.4.0-original-dio.
> >
> > | IOPS |
> > | kernel | read-4k | read-16k | write-4k | write-16k |
> > |------------------------+------------------+-------------------+-------------------+-------------------|
> > | 5.13.0-rc3-iomap-ext4 | 1.01192950082305 | 1.00026413252562 | 0.806377013901006 | 1.00020735846057 |
> > | 5.5.0-old-iomap-ext4 | 1.01154156662508 | 0.998753983520427 | 0.777051125458035 | 0.999937792461829 |
> > | 5.13.0-rc3-iomap-xfs | 1.00234888443008 | 1.00027645151444 | 1.00996172750095 | 1.00156349447934 |
> > | 5.5.0-old-iomap-xfs | 1.00010412786902 | 1.00202731110586 | 1.01502846821264 | 1.00149431330769 |
> >
> >
> > Total IO is the amount of data copied (relative to baseline).
> >
> > | TOTAL_IO |
> > | kernel | read-4k | read-16k | write-4k | write-16k |
> > |------------------------+------------------+-------------------+-------------------+-------------------|
> > | 5.13.0-rc3-iomap-ext4 | 1.01193023173156 | 1.00026332569559 | 0.806377530301477 | 1.00014686835205 |
> > | 5.5.0-old-iomap-ext4 | 1.01154196621591 | 0.998758131673757 | 0.777050753425118 | 0.999902824986834 |
> > | 5.13.0-rc3-iomap-xfs | 1.00234893734134 | 1.00027535318322 | 1.00996437458991 | 1.00156305646789 |
> > | 5.5.0-old-iomap-xfs | 1.00010328564078 | 1.00202831801018 | 1.01503060595258 | 1.00149069402364 |
> >
> > With a visualization of the above data here:
> >
> > <https://people.collabora.com/~krisman/dio/bench.png>
> >
> > The only out-of-the-ordinary result seems to be write-4k for Ext4,
> > which shows around 20% fewer IOPS (and less total IO) for iomap in
> > comparison to the original DIO. This is not a one-off run; it is
> > consistently reproducible across test runs in my environment. The
> > performance reduction also doesn't reproduce on XFS.
> >
> > I tried to limit the influence of other parts of the kernel that could
> > affect the behavior by comparing the kernel immediately before the
> > introduction of dio-iomap for ext4 with the first release that
> > includes it. Since xfs doesn't change, I believe the regression is
> > ext4-specific.
> >
> > I'm also publishing raw data and all related material to the link below,
> > in case anyone wants to tinker with my data:
> >
> > https://people.collabora.com/~krisman/dio/
> >
> > Perhaps I'm missing something obvious, but I can't pinpoint a specific
> > problem with my analysis. Is this expected, given the way ext4 iomap
> > works? Do you have any idea of the root cause or how it can be improved?
> >
> > I will keep looking into this issue, but I'd like to share this partial
> > result, in case there is a problem with my analysis or you have any
> > suggestions.
> >
> > Thanks,
>
> --
> Gabriel Krisman Bertazi