Message-ID: <20260203062523.3869120-1-yi.zhang@huawei.com>
Date: Tue, 3 Feb 2026 14:25:00 +0800
From: Zhang Yi <yi.zhang@...wei.com>
To: linux-ext4@...r.kernel.org
Cc: linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
tytso@....edu,
adilger.kernel@...ger.ca,
jack@...e.cz,
ojaswin@...ux.ibm.com,
ritesh.list@...il.com,
hch@...radead.org,
djwong@...nel.org,
yi.zhang@...wei.com,
yi.zhang@...weicloud.com,
yizhang089@...il.com,
libaokun1@...wei.com,
yangerkun@...wei.com,
yukuai@...as.com
Subject: [PATCH -next v2 00/22] ext4: use iomap for regular file's buffered I/O path
From: Zhang Yi <yi.zhang@...weicloud.com>
Changes since V1:
- Rebase this series on linux-next 20260122.
- Refactor partial block zero range, stop passing handle to
ext4_block_truncate_page() and ext4_zero_partial_blocks(), and move
partial block zeroing operation outside an active journal transaction
to prevent potential deadlocks because of the lock ordering of folio
and transaction start.
- Clarify the lock ordering of folio lock and transaction start, update
the comments accordingly.
- Fix some issues related to fast commit and to polluting the post-EOF
folio.
- Some minor code and comments optimizations.
v1: https://lore.kernel.org/linux-ext4/20241022111059.2566137-1-yi.zhang@huaweicloud.com/
RFC v4: https://lore.kernel.org/linux-ext4/20240410142948.2817554-1-yi.zhang@huaweicloud.com/
RFC v3: https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
RFC v2: https://lore.kernel.org/linux-ext4/20240102123918.799062-1-yi.zhang@huaweicloud.com/
RFC v1: https://lore.kernel.org/linux-ext4/20231123125121.4064694-1-yi.zhang@huaweicloud.com/
Original Cover (Updated):
This series adds iomap buffered I/O path support for regular files. It
implements the core iomap APIs on ext4 and introduces two mount options,
'buffered_iomap' and 'nobuffered_iomap', to enable and disable the iomap
buffered I/O path. The series supports ext4's default features, default
mount options, and the bigalloc feature. It does not yet support online
defragmentation, inline data, fsverity, fscrypt, non-extent files, or
data=journal mode; ext4 falls back to the buffer_head I/O path
automatically when any of these features or options is in use.
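The fallback rule can be pictured with a small userspace model. This is
only a sketch: struct file_model, its fields, and
model_use_buffered_iomap() are hypothetical names made up for
illustration and are not taken from the patches. Online defragmentation
is left out because, per the shortlog, it is disabled for inodes that
use the iomap path rather than checked as an inode property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the decision depends on. */
struct file_model {
	bool mount_buffered_iomap;	/* 'buffered_iomap' mount option set */
	bool regular_file;		/* only regular files are covered */
	bool extent_mapped;		/* non-extent files are unsupported */
	bool inline_data;
	bool verity;			/* fsverity */
	bool encrypted;			/* fscrypt */
	bool data_journal;		/* data=journal mode */
};

/*
 * Return true if the modelled file would take the iomap buffered I/O
 * path, false if it would fall back to the buffer_head path.
 */
static bool model_use_buffered_iomap(const struct file_model *f)
{
	assert(f != NULL);

	if (!f->mount_buffered_iomap || !f->regular_file)
		return false;
	if (!f->extent_mapped || f->inline_data)
		return false;
	if (f->verity || f->encrypted || f->data_journal)
		return false;
	return true;
}
```

In this model, a regular extent-mapped file on a 'buffered_iomap' mount
takes the iomap path, and setting any one of the unsupported flags flips
the result back to buffer_head.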
Key notes on the iomap implementation in this series:
- Don't rely on ordered data mode to prevent exposing stale data when
performing append writes and truncating down.
- Override the dioread_nolock mount option and always allocate unwritten
extents for new blocks.
- When performing writeback, don't use a reserved journal handle, and
postpone updating i_disksize until I/O is done.
- The lock ordering of the folio lock and transaction start is the
opposite of that in the buffer_head buffered write path.
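The deferred i_disksize update can be sketched as a toy userspace model.
Everything here is an assumption made for illustration: struct
inode_model and both helpers are hypothetical, and clamping the new
on-disk size to i_size is my simplification, not something stated in the
series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: in-memory size versus size recorded on disk. */
struct inode_model {
	long long i_size;	/* size visible to readers in memory */
	long long i_disksize;	/* size that has been made durable */
};

/* A buffered write only extends the in-memory size. */
static void model_buffered_write(struct inode_model *inode,
				 long long pos, long long len)
{
	assert(inode != NULL && pos >= 0 && len >= 0);

	if (pos + len > inode->i_size)
		inode->i_size = pos + len;
	/* i_disksize is deliberately left untouched here. */
}

/* Writeback completion is the only place that advances i_disksize. */
static void model_end_io(struct inode_model *inode,
			 long long pos, long long len)
{
	long long end;

	assert(inode != NULL && pos >= 0 && len >= 0);

	end = pos + len;
	if (end > inode->i_size)	/* assumed clamp, see lead-in */
		end = inode->i_size;
	if (end > inode->i_disksize)	/* never move backwards */
		inode->i_disksize = end;
}
```

In this model a crash between the write and the I/O completion leaves
i_disksize at its old value, so blocks whose data never reached the disk
are not exposed through the file size.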
Series details:
Patch 01-08: Refactor the partial block zeroing operation, move it out
of an active running journal transaction, and handle post-EOF
partial block zeroing properly.
Patch 09-21: Implement the core iomap buffered read and write paths, the
dirty folio writeback path, the mmap path, and the partial block
zeroing path for ext4 regular files.
Patch 22: Introduce the 'buffered_iomap' and 'nobuffered_iomap' mount
options to enable and disable the iomap buffered I/O path.
Tests and Performance:
I tested this series using xfstests-bld with auto configurations, as
well as fast_commit and 64k configurations. No new regressions were
observed.
I ran fio on a virtual machine with a 150 GB memory disk and found an
improvement of approximately 30% to 50% in large I/O write performance,
while read performance showed no significant difference.
buffered write
==============

buffer_head:
bs      write cache     uncached write
1k       423 MiB/s       36.3 MiB/s
4k      1067 MiB/s       58.4 MiB/s
64k     4321 MiB/s        869 MiB/s
1M      4640 MiB/s       3158 MiB/s

iomap:
bs      write cache     uncached write
1k       403 MiB/s         57 MiB/s
4k      1093 MiB/s         61 MiB/s
64k     6488 MiB/s       1206 MiB/s
1M      7378 MiB/s       4818 MiB/s
buffered read
=============

buffer_head:
bs      read hole       read cache      read data
1k       635 MiB/s       661 MiB/s       605 MiB/s
4k      1987 MiB/s      2128 MiB/s      1761 MiB/s
64k     6068 MiB/s      9472 MiB/s      4475 MiB/s
1M      5471 MiB/s      8657 MiB/s      4405 MiB/s

iomap:
bs      read hole       read cache      read data
1k       643 MiB/s       653 MiB/s       602 MiB/s
4k      2075 MiB/s      2159 MiB/s      1716 MiB/s
64k     6267 MiB/s      9545 MiB/s      4451 MiB/s
1M      6072 MiB/s      9191 MiB/s      4467 MiB/s
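For readers who want to turn the rows above into relative numbers, the
following helper is a minimal sketch. percent_change() is a hypothetical
name, not part of fio or of this series; it simply computes
(after - before) / before as a percentage.

```c
#include <assert.h>

/*
 * Relative change from a baseline throughput to a new throughput,
 * in percent. Both values must use the same unit (MiB/s here).
 */
static double percent_change(double before, double after)
{
	assert(before > 0.0);

	return (after - before) / before * 100.0;
}
```

Fed with the buffered write rows, this gives roughly 50% (64k) and 59%
(1M) for the write cache column, and roughly 39% (64k) and 53% (1M) for
the uncached write column.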
Comments and suggestions are welcome!
Thanks,
Yi.
Zhang Yi (22):
ext4: make ext4_block_zero_page_range() pass out did_zero
ext4: make ext4_block_truncate_page() return zeroed length
ext4: only order data when partially block truncating down
ext4: factor out journalled block zeroing range
ext4: stop passing handle to ext4_journalled_block_zero_range()
ext4: don't zero partial block under an active handle when truncating down
ext4: move ext4_block_zero_page_range() out of an active handle
ext4: zero post EOF partial block before appending write
ext4: add a new iomap aops for regular file's buffered IO path
ext4: implement buffered read iomap path
ext4: pass out extent seq counter when mapping da blocks
ext4: implement buffered write iomap path
ext4: implement writeback iomap path
ext4: implement mmap iomap path
iomap: correct the range of a partial dirty clear
iomap: support invalidating partial folios
ext4: implement partial block zero range iomap path
ext4: do not order data for inodes using buffered iomap path
ext4: add block mapping tracepoints for iomap buffered I/O path
ext4: disable online defrag when inode using iomap buffered I/O path
ext4: partially enable iomap for the buffered I/O path of regular files
ext4: introduce a mount option for iomap buffered I/O path
fs/ext4/ext4.h | 21 +-
fs/ext4/ext4_jbd2.c | 1 +
fs/ext4/ext4_jbd2.h | 7 +-
fs/ext4/extents.c | 31 +-
fs/ext4/file.c | 40 +-
fs/ext4/ialloc.c | 1 +
fs/ext4/inode.c | 822 ++++++++++++++++++++++++++++++++----
fs/ext4/move_extent.c | 11 +
fs/ext4/page-io.c | 119 ++++++
fs/ext4/super.c | 32 +-
fs/iomap/buffered-io.c | 12 +-
include/trace/events/ext4.h | 45 ++
12 files changed, 1033 insertions(+), 109 deletions(-)
--
2.52.0