Message-ID: <cover.1762945505.git.ojaswin@linux.ibm.com>
Date: Wed, 12 Nov 2025 16:36:03 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Christian Brauner <brauner@...nel.org>, djwong@...nel.org,
        ritesh.list@...il.com, john.g.garry@...cle.com, tytso@....edu,
        willy@...radead.org, dchinner@...hat.com, hch@....de
Cc: linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, jack@...e.cz, nilay@...ux.ibm.com,
        martin.petersen@...cle.com, rostedt@...dmis.org, axboe@...nel.dk,
        linux-block@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: [RFC PATCH 0/8] xfs: single block atomic writes for buffered IO

This series adds support for single block RWF_ATOMIC writes to iomap/xfs
buffered IO. It builds upon the initial RFC shared by John Garry last
year [1]. Most of the details are in the respective commit messages, but
I'd like to mention some of the design points below:

1. The first 4 patches introduce the statx and iomap plumbing and page
flags to add basic atomic write support to buffered IO. However, 2 key
restrictions still apply:

FIRST: If the user buffer of an atomic write crosses a page boundary, a
short write is possible, for example if 1 user page could not be faulted in
or got reclaimed before the copy operation. For now we don't allow such a
scenario, by requiring the user buffer to be page aligned. This way either
the full write goes through or nothing does. This is also discussed in
Matthew Wilcox's comment here [2].
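To make the alignment requirement concrete from the userspace side, here
is a minimal sketch of how a caller might allocate and check a page-aligned
buffer before attempting a buffered atomic write. The helper names
(buf_is_page_aligned, alloc_page_aligned) are made up for illustration and
are not part of this series:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: true if buf starts on a page boundary. */
static bool buf_is_page_aligned(const void *buf, size_t page_size)
{
	/* Page sizes are powers of two, so a mask test is sufficient. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return ((uintptr_t)buf & (page_size - 1)) == 0;
}

/* Hypothetical helper: allocate a zeroed, page-aligned buffer of len bytes. */
static void *alloc_page_aligned(size_t len)
{
	void *buf = NULL;
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);

	/* posix_memalign() returns 0 on success, an errno value otherwise. */
	if (posix_memalign(&buf, page_size, len) != 0)
		return NULL;
	memset(buf, 0, len);
	return buf;
}
```

With block size == page size, a page-aligned buffer of exactly one block
sits in a single user page, so the copy cannot be split across two pages.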

This restriction is lifted in patch 5. The approach we took was to:
 1. Pin the user pages.
 2. Create a BVEC out of the struct page to pass to
    copy_folio_from_iter_atomic() rather than the USER backed iter. We
    don't use the user iter directly because the pinned user page could
    still get unmapped from the process, leading to short writes.

This approach allows us to only proceed if we are sure we will not have a short
copy.
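The "pin everything first, then copy" idea can be illustrated with a small
userspace toy model. This is not the kernel code from patch 5: struct
model_page, its resident flag and all model_* functions are invented for
this sketch, and the memcpy() merely stands in for a copy from already
pinned pages that cannot fail halfway:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Toy stand-in for a user page; "resident" models whether it can be pinned. */
struct model_page {
	bool resident;
	int pin_count;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Phase 1: pin every source page, or leave all of them unpinned. */
static bool model_pin_all(struct model_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pages[i]->resident) {
			/* Unwind the pins taken so far and report failure. */
			while (i-- > 0)
				pages[i]->pin_count--;
			return false;
		}
		pages[i]->pin_count++;
	}
	return true;
}

static void model_unpin_all(struct model_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i]->pin_count--;
}

/*
 * Phase 2: copy only after all pages are pinned.
 * Returns n * MODEL_PAGE_SIZE on success, or 0 with dst left untouched.
 */
static size_t model_atomic_copy(unsigned char *dst,
				struct model_page **pages, size_t n)
{
	assert(dst != NULL && pages != NULL);

	if (!model_pin_all(pages, n))
		return 0;
	for (size_t i = 0; i < n; i++)
		memcpy(dst + i * MODEL_PAGE_SIZE, pages[i]->data,
		       MODEL_PAGE_SIZE);
	model_unpin_all(pages, n);
	return n * MODEL_PAGE_SIZE;
}
```

The point of the model is only the ordering: the failure case is detected
before any byte reaches the destination, so there is no partial copy to
undo.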

SECOND: We only support buffered IO atomic writes when block size == page
size. This is to avoid the following scenario:
 1. A 4kb block atomic write marks the complete 64kb folio as
    atomic.
 2. Other writes dirty the whole 64kb folio.
 3. Writeback sees the whole folio dirty and atomic and tries
    to send a 64kb atomic write, which might exceed the
    allowed atomic write size and fail.

Patch 7 adds support for sub-page atomic write tracking to remove this
restriction.  We do this by adding 2 more bitmaps to ifs to track atomic
write start and end.
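As a rough illustration of why start/end bitmaps help, here is a userspace
toy model of a 64kb folio with 4kb blocks. The struct layout, the 32-bit
bitmaps and the model_* names are assumptions made for this sketch and do
not reflect the actual ifs layout in patch 7. Overlap between atomic and
non-atomic writes is deliberately not modelled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BLOCKS_PER_FOLIO 16u	/* e.g. 64kb folio / 4kb blocks */

/* Toy per-folio state: one bit per block in each bitmap. */
struct model_ifs {
	uint32_t dirty;
	uint32_t atomic_start;
	uint32_t atomic_end;
};

/* Plain (non-atomic) write: only dirties blocks first..last. */
static void model_mark_dirty(struct model_ifs *ifs, unsigned first,
			     unsigned last)
{
	assert(first <= last && last < MODEL_BLOCKS_PER_FOLIO);
	for (unsigned b = first; b <= last; b++)
		ifs->dirty |= 1u << b;
}

/* Atomic write: dirty the blocks and record where the range begins/ends. */
static void model_mark_atomic(struct model_ifs *ifs, unsigned first,
			      unsigned last)
{
	model_mark_dirty(ifs, first, last);
	ifs->atomic_start |= 1u << first;
	ifs->atomic_end |= 1u << last;
}

/*
 * Writeback helper: find the next atomic range starting at or after "from".
 * Returns true and fills *first/*last when a complete range is found.
 */
static bool model_next_atomic_range(const struct model_ifs *ifs, unsigned from,
				    unsigned *first, unsigned *last)
{
	for (unsigned s = from; s < MODEL_BLOCKS_PER_FOLIO; s++) {
		if (!(ifs->atomic_start & (1u << s)))
			continue;
		for (unsigned e = s; e < MODEL_BLOCKS_PER_FOLIO; e++) {
			if (ifs->atomic_end & (1u << e)) {
				*first = s;
				*last = e;
				return true;
			}
		}
		return false;	/* start bit without a matching end bit */
	}
	return false;
}
```

In this model, an atomic write to block 3 followed by plain writes to every
other block leaves the whole folio dirty, yet the writeback walk still finds
the atomic range as just block 3, so it could be sent as a 4kb atomic write
instead of a 64kb one.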

Lastly, a non-atomic write over an atomic write will remove the atomic
guarantee. Userspace is expected to sync the data to disk after an atomic
write before performing any overwrites.
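From the application side, the expected ordering could look roughly like
the sketch below: issue the write, make sure it was not short, and sync
before any later overwrite of the same range. write_then_sync() is a
hypothetical helper; pwritev2() and fdatasync() are the existing syscalls.
The fallback RWF_ATOMIC definition is an assumption for older libc headers
and mirrors the value in the uapi header:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040	/* assumed fallback for older libc headers */
#endif

/*
 * Hypothetical helper: write len bytes at off with the given RWF_* flags,
 * then flush the data to disk. Returns 0 on success, -1 with errno set on
 * failure. A short write is treated as an error.
 */
static int write_then_sync(int fd, const void *buf, size_t len, off_t off,
			   int flags)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL && len > 0);

	ret = pwritev2(fd, &iov, 1, off, flags);
	if (ret < 0)
		return -1;
	if ((size_t)ret != len) {
		errno = EIO;
		return -1;
	}
	/* Sync before the caller is allowed to overwrite this range. */
	return fdatasync(fd);
}

/* Atomic variant: fails on kernels/files without atomic write support. */
static int atomic_write_then_sync(int fd, const void *buf, size_t len,
				  off_t off)
{
	return write_then_sync(fd, buf, len, off, RWF_ATOMIC);
}
```

A caller would use atomic_write_then_sync() for the atomic update and only
issue ordinary overwrites of that range after it has returned 0.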

This series has survived -g quick xfstests and I'll be continuing to
test it.  Just wanted to put out the RFC to get some reviews on the
design and suggestions on any better approaches.

[1] https://lore.kernel.org/all/20240422143923.3927601-1-john.g.garry@oracle.com/
[2] https://lore.kernel.org/all/ZiZ8XGZz46D3PRKr@casper.infradead.org/

Thanks,
Ojaswin

John Garry (2):
  fs: Rename STATX{_ATTR}_WRITE_ATOMIC -> STATX{_ATTR}_WRITE_ATOMIC_DIO
  mm: Add PG_atomic

Ojaswin Mujoo (6):
  fs: Add initial buffered atomic write support info to statx
  iomap: buffered atomic write support
  iomap: pin pages for RWF_ATOMIC buffered write
  xfs: Report atomic write min and max for buf io as well
  iomap: Add bs<ps buffered atomic writes support
  xfs: Lift the bs == ps restriction for HW buffered atomic writes

 .../filesystems/ext4/atomic_writes.rst        |   4 +-
 block/bdev.c                                  |   7 +-
 fs/ext4/inode.c                               |   9 +-
 fs/iomap/buffered-io.c                        | 395 ++++++++++++++++--
 fs/iomap/ioend.c                              |  21 +-
 fs/iomap/trace.h                              |  12 +-
 fs/read_write.c                               |   3 -
 fs/stat.c                                     |  33 +-
 fs/xfs/xfs_file.c                             |   9 +-
 fs/xfs/xfs_iops.c                             | 127 +++---
 fs/xfs/xfs_iops.h                             |   6 +-
 include/linux/fs.h                            |   3 +-
 include/linux/iomap.h                         |   3 +
 include/linux/page-flags.h                    |   5 +
 include/trace/events/mmflags.h                |   3 +-
 include/trace/misc/fs.h                       |   3 +-
 include/uapi/linux/stat.h                     |  10 +-
 tools/include/uapi/linux/stat.h               |  10 +-
 .../trace/beauty/include/uapi/linux/stat.h    |  10 +-
 19 files changed, 551 insertions(+), 122 deletions(-)

-- 
2.51.0

