Message-ID: <20260203062523.3869120-1-yi.zhang@huawei.com>
Date: Tue,  3 Feb 2026 14:25:00 +0800
From: Zhang Yi <yi.zhang@...wei.com>
To: linux-ext4@...r.kernel.org
Cc: linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	tytso@....edu,
	adilger.kernel@...ger.ca,
	jack@...e.cz,
	ojaswin@...ux.ibm.com,
	ritesh.list@...il.com,
	hch@...radead.org,
	djwong@...nel.org,
	yi.zhang@...wei.com,
	yi.zhang@...weicloud.com,
	yizhang089@...il.com,
	libaokun1@...wei.com,
	yangerkun@...wei.com,
	yukuai@...as.com
Subject: [PATCH -next v2 00/22]  ext4: use iomap for regular file's buffered I/O path

From: Zhang Yi <yi.zhang@...weicloud.com>

Changes since V1:
 - Rebase this series on linux-next 20260122.
 - Refactor partial block zero range: stop passing the handle to
   ext4_block_truncate_page() and ext4_zero_partial_blocks(), and move
   the partial block zeroing operation outside of an active journal
   transaction to prevent potential deadlocks caused by the lock
   ordering of the folio lock and transaction start.
 - Clarify the lock ordering of the folio lock and transaction start,
   and update the comments accordingly.
 - Fix some issues related to fast commit and to polluting post-EOF
   folios.
 - Some minor code and comment optimizations.

v1:     https://lore.kernel.org/linux-ext4/20241022111059.2566137-1-yi.zhang@huaweicloud.com/
RFC v4: https://lore.kernel.org/linux-ext4/20240410142948.2817554-1-yi.zhang@huaweicloud.com/
RFC v3: https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
RFC v2: https://lore.kernel.org/linux-ext4/20240102123918.799062-1-yi.zhang@huaweicloud.com/
RFC v1: https://lore.kernel.org/linux-ext4/20231123125121.4064694-1-yi.zhang@huaweicloud.com/

Original Cover (Updated):

This series adds iomap buffered I/O path support for regular files.
It implements the core iomap APIs on ext4 and introduces two mount
options, 'buffered_iomap' and 'nobuffered_iomap', to enable and
disable the iomap buffered I/O path. This series supports the default
features, the default mount options and the bigalloc feature for ext4.
We do not yet support online defragmentation, inline data, fsverity,
fscrypt, non-extent files, or data=journal mode; ext4 falls back to
the buffer_head I/O path automatically if any of these features or
options are used.
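
The fallback rule described above can be modelled in a few lines. This
is only a user-space sketch of the decision, not the kernel code: the
ToyInode type, the feature names and pick_buffered_io_path() are all
hypothetical and simply mirror the list in this cover letter. Online
defragmentation is left out of the set because the series lists a
separate patch that disables it for inodes on the iomap path.

```python
from dataclasses import dataclass

# Hypothetical feature names mirroring the cover letter's list of
# unsupported features and options; these are not kernel identifiers.
UNSUPPORTED = frozenset({
    "inline_data",
    "fsverity",
    "fscrypt",
    "non_extent",
    "data_journal",
})


@dataclass(frozen=True)
class ToyInode:
    """Toy stand-in for an inode: file type plus features in use."""
    is_regular_file: bool
    features: frozenset = frozenset()


def pick_buffered_io_path(inode: ToyInode, buffered_iomap: bool) -> str:
    """Return "iomap" or "buffer_head" for the given toy inode."""
    if not buffered_iomap:
        # Mount option is off: always use the legacy path.
        return "buffer_head"
    if not inode.is_regular_file:
        # The series only covers regular files.
        return "buffer_head"
    if inode.features & UNSUPPORTED:
        # Any unsupported feature forces the automatic fallback.
        return "buffer_head"
    return "iomap"


if __name__ == "__main__":
    plain = ToyInode(True, frozenset({"bigalloc"}))
    inline = ToyInode(True, frozenset({"inline_data"}))
    print(pick_buffered_io_path(plain, True))
    print(pick_buffered_io_path(inline, True))
```

In this model bigalloc does not trigger a fallback, matching the
statement that the series supports it.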

Key notes on the iomap implementation in this series:
 - Do not rely on ordered data mode to prevent exposing stale data when
   performing append writes and truncating down.
 - Override the dioread_nolock mount option and always allocate
   unwritten extents for new blocks.
 - During writeback, do not use a reserved journal handle, and postpone
   updating i_disksize until the I/O is done.
 - The lock ordering of the folio lock and transaction start is the
   opposite of that in the buffer_head buffered write path.
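
To illustrate why unwritten extents plus a deferred i_disksize update
can stand in for ordered data mode on append writes, here is a toy
model. It is an assumption-laden sketch, not how ext4 is implemented:
the ToyFile class, its block states and the after_crash() helper are
hypothetical and exist only to show the ordering of events.

```python
class ToyFile:
    """Toy file whose blocks are either 'unwritten' or 'written'."""

    def __init__(self):
        self.blocks = []      # per-block extent state
        self.i_size = 0       # in-memory size, counted in blocks
        self.i_disksize = 0   # size recorded on disk, counted in blocks

    def append_write(self, nblocks):
        # New blocks are allocated as unwritten extents and only the
        # in-memory size grows; nothing is published on disk yet.
        self.blocks.extend(["unwritten"] * nblocks)
        self.i_size += nblocks

    def writeback_done(self):
        # I/O completion: convert extents to written, then publish
        # the new size by updating i_disksize.
        self.blocks = ["written" for _ in self.blocks]
        self.i_disksize = self.i_size

    def after_crash(self):
        # What a reader could see after a crash: only blocks below
        # i_disksize, and unwritten blocks read back as zeros.
        visible = self.blocks[: self.i_disksize]
        return ["data" if s == "written" else "zeros" for s in visible]


if __name__ == "__main__":
    f = ToyFile()
    f.append_write(2)
    print(f.after_crash())   # crash before I/O completion
    f.writeback_done()
    print(f.after_crash())   # crash after I/O completion
```

In this model there is no state in which a reader sees a block that was
allocated but never written, which is the property ordered data mode
otherwise provides.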

Series details:

Patch 01-08: Refactor the partial block zeroing operation, move it out
             of an active running journal transaction, and handle post
             EOF partial block zeroing properly.
Patch 09-21: Implement the core iomap buffered read and write paths, the
             dirty folio writeback path, the mmap path and the partial
             block zeroing path for ext4 regular files.
Patch 22:    Introduce the 'buffered_iomap' and 'nobuffered_iomap' mount
             options to enable and disable the iomap buffered I/O path.

Tests and Performance:

I tested this series using xfstests-bld with auto configurations, as
well as fast_commit and 64k configurations. No new regressions were
observed.

I used fio to test my virtual machine with a 150 GB memory disk and
found an improvement of approximately 30% to 50% in large I/O write
performance, while read performance showed no significant difference.

 buffered write
 ==============

  buffer_head:
  bs      write cache    uncached write
  1k       423  MiB/s      36.3 MiB/s
  4k       1067 MiB/s      58.4 MiB/s
  64k      4321 MiB/s      869  MiB/s
  1M       4640 MiB/s      3158 MiB/s
  
  iomap:
  bs      write cache    uncached write
  1k       403  MiB/s      57   MiB/s
  4k       1093 MiB/s      61   MiB/s
  64k      6488 MiB/s      1206 MiB/s
  1M       7378 MiB/s      4818 MiB/s

 buffered read
 =============

  buffer_head:
  bs      read hole   read cache      read data
  1k       635  MiB/s    661  MiB/s    605  MiB/s
  4k       1987 MiB/s    2128 MiB/s    1761 MiB/s
  64k      6068 MiB/s    9472 MiB/s    4475 MiB/s
  1M       5471 MiB/s    8657 MiB/s    4405 MiB/s

  iomap:
  bs      read hole   read cache      read data
  1k       643  MiB/s    653  MiB/s    602  MiB/s
  4k       2075 MiB/s    2159 MiB/s    1716 MiB/s
  64k      6267 MiB/s    9545 MiB/s    4451 MiB/s
  1M       6072 MiB/s    9191 MiB/s    4467 MiB/s
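
For readers who want the relative gains rather than raw throughput, the
buffered write figures above can be turned into percentages with a
short script. The numbers are copied from the tables in this message;
the helper name gain_percent() is just an illustrative choice.

```python
# Throughput in MiB/s copied from the buffered write tables above,
# as (write cache, uncached write) per block size.
buffer_head = {
    "1k": (423, 36.3),
    "4k": (1067, 58.4),
    "64k": (4321, 869),
    "1M": (4640, 3158),
}
iomap = {
    "1k": (403, 57),
    "4k": (1093, 61),
    "64k": (6488, 1206),
    "1M": (7378, 4818),
}


def gain_percent(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Per block size: (write cache gain, uncached write gain).
gains = {
    bs: tuple(gain_percent(o, n) for o, n in zip(buffer_head[bs], iomap[bs]))
    for bs in buffer_head
}

for bs, (cached, uncached) in gains.items():
    print(f"{bs:>4}: write cache {cached:+6.1f}%  uncached write {uncached:+6.1f}%")
```

With the figures in the tables, the 64k and 1M rows work out to gains
of roughly 39% to 59%, while the cached 1k and 4k rows stay within a
few percent of the buffer_head path.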

Comments and suggestions are welcome!

Thanks,
Yi.


Zhang Yi (22):
  ext4: make ext4_block_zero_page_range() pass out did_zero
  ext4: make ext4_block_truncate_page() return zeroed length
  ext4: only order data when partially block truncating down
  ext4: factor out journalled block zeroing range
  ext4: stop passing handle to ext4_journalled_block_zero_range()
  ext4: don't zero partial block under an active handle when truncating
    down
  ext4: move ext4_block_zero_page_range() out of an active handle
  ext4: zero post EOF partial block before appending write
  ext4: add a new iomap aops for regular file's buffered IO path
  ext4: implement buffered read iomap path
  ext4: pass out extent seq counter when mapping da blocks
  ext4: implement buffered write iomap path
  ext4: implement writeback iomap path
  ext4: implement mmap iomap path
  iomap: correct the range of a partial dirty clear
  iomap: support invalidating partial folios
  ext4: implement partial block zero range iomap path
  ext4: do not order data for inodes using buffered iomap path
  ext4: add block mapping tracepoints for iomap buffered I/O path
  ext4: disable online defrag when inode using iomap buffered I/O path
  ext4: partially enable iomap for the buffered I/O path of regular
    files
  ext4: introduce a mount option for iomap buffered I/O path

 fs/ext4/ext4.h              |  21 +-
 fs/ext4/ext4_jbd2.c         |   1 +
 fs/ext4/ext4_jbd2.h         |   7 +-
 fs/ext4/extents.c           |  31 +-
 fs/ext4/file.c              |  40 +-
 fs/ext4/ialloc.c            |   1 +
 fs/ext4/inode.c             | 822 ++++++++++++++++++++++++++++++++----
 fs/ext4/move_extent.c       |  11 +
 fs/ext4/page-io.c           | 119 ++++++
 fs/ext4/super.c             |  32 +-
 fs/iomap/buffered-io.c      |  12 +-
 include/trace/events/ext4.h |  45 ++
 12 files changed, 1033 insertions(+), 109 deletions(-)

-- 
2.52.0

