Message-ID: <87msbcwsjp.fsf@gmail.com>
Date: Fri, 16 May 2025 19:45:22 +0530
From: Ritesh Harjani (IBM) <ritesh.list@...il.com>
To: John Garry <john.g.garry@...cle.com>, linux-ext4@...r.kernel.org
Cc: Theodore Ts'o <tytso@....edu>, Jan Kara <jack@...e.cz>, djwong@...nel.org, Ojaswin Mujoo <ojaswin@...ux.ibm.com>, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v5 7/7] ext4: Add atomic block write documentation

John Garry <john.g.garry@...cle.com> writes:

> On 15/05/2025 20:50, Ritesh Harjani (IBM) wrote:
>
> thanks for adding this info
>
>> Application Interface
>
> Should we put this into a common file, as it is just not relevant to ext4?
>
> Or move this file to a common location, and have separate sections for 
> ext4 and xfs? This would save having scattered files for instructions.
>

The purpose of adding this documentation was mainly to note down some of
the implementation details around multi-fsblock atomic writes for ext4
using bigalloc, which are otherwise easy to miss. But since no general
documentation on atomic writes was available, we added a bit more
information around it, mainly enough to cover ext4.

>> +~~~~~~~~~~~~~~~~~~~~~
>> +
>> +Applications can use the ``pwritev2()`` system call with the ``RWF_ATOMIC`` flag
>> +to perform atomic writes:
>> +
>> +.. code-block:: c
>> +
>> +    pwritev2(fd, iov, iovcnt, offset, RWF_ATOMIC);
>> +
>> +The write must be aligned to the filesystem's block size and not exceed the
>> +filesystem's maximum atomic write unit size.
>> +See ``generic_atomic_write_valid()`` for more details.
>> +
>> +``statx()`` system call with ``STATX_WRITE_ATOMIC`` flag can provides following
>> +details:
>> +
>> + * ``stx_atomic_write_unit_min``: Minimum size of an atomic write request.
>> + * ``stx_atomic_write_unit_max``: Maximum size of an atomic write request.
>> + * ``stx_atomic_write_segments_max``: Upper limit for segments. The number of
>> +   separate memory buffers that can be gathered into a write operation
>
> there will also be stx_atomic_write_unit_max_opt, as queued for 6.16
>
> For HW-only support, I think that it is ok to just return same as 
> stx_atomic_write_unit_max when we can atomic write > 1 filesystem block
>

Yes, so for HW-only support, as in ext4, it may not be strictly required.
To avoid a dependency on the XFS patch series, I think it will be better
if we add those changes after XFS multi-fsblock atomic write support has
landed :)


>> +   (e.g., the iovcnt parameter for IOV_ITER).
>
>
>> Currently, this is always set to one.
>
> JFYI, for xfs supporting filesystem-based atomic writes only, i.e. no HW 
> support, we could set this to a higher value
>

Yes. But again, that is an XFS-specific detail, not strictly relevant to
the ext4 atomic write documentation.

>> +
>> +The STATX_ATTR_WRITE_ATOMIC flag in ``statx->attributes`` is set if atomic
>> +writes are supported.
>> +
>> +.. _atomic_write_bdev_support:
>> +
>> +Hardware Support
>> +----------------
>> +
>> +The underlying storage device must support atomic write operations.
>> +Modern NVMe and SCSI devices often provide this capability.
>> +The Linux kernel exposes this information through sysfs:
>> +
>> +* ``/sys/block/<device>/queue/atomic_write_unit_min`` - Minimum atomic write size
>> +* ``/sys/block/<device>/queue/atomic_write_unit_max`` - Maximum atomic write size
>
> there is also the max bytes and boundary files. I am not sure if it was 
> intentional to omit them.
>

The intention of this section was mainly for a sysadmin to first check
whether the underlying block device supports atomic writes and what its
atomic write unit (awu) limits are, in order to decide on an appropriate
blocksize and/or clustersize for the ext4 filesystem.

See the section "Creating Filesystems with Atomic Write Support", which
refers to this section first.
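
To illustrate the check described above, here is a minimal userspace
sketch that reads the two sysfs attributes and reports whether both are
nonzero. The helper names (read_attr_ull(), bdev_atomic_write_units())
are made up for this example and are not part of the patch or of any
existing tool; the caller is assumed to pass the queue directory, e.g.
/sys/block/<device>/queue.

```c
#include <stdio.h>

/*
 * Read one unsigned integer from a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 * (Hypothetical helper, for illustration only.)
 */
static int read_attr_ull(const char *path, unsigned long long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Fetch atomic_write_unit_min and atomic_write_unit_max from queue_dir,
 * e.g. "/sys/block/<device>/queue".
 * Returns 1 if both values are nonzero (the device advertises atomic
 * writes), 0 if either is zero, and -1 on a read error.
 * (Hypothetical helper, for illustration only.)
 */
static int bdev_atomic_write_units(const char *queue_dir,
				   unsigned long long *min,
				   unsigned long long *max)
{
	char path[4096];

	snprintf(path, sizeof(path), "%s/atomic_write_unit_min", queue_dir);
	if (read_attr_ull(path, min))
		return -1;
	snprintf(path, sizeof(path), "%s/atomic_write_unit_max", queue_dir);
	if (read_attr_ull(path, max))
		return -1;
	return (*min != 0 && *max != 0) ? 1 : 0;
}
```

A caller would pass the queue directory of the device backing the
filesystem and then pick a blocksize/clustersize that fits within the
reported min/max range.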

>> +
>> +Nonzero values for these attributes indicate that the device supports
>> +atomic writes.
>> +
>> +See Also
>
> thanks,
> John

Thanks for the review, John.

I think the current documentation mainly covers ext4-specific
implementation notes on single and multi-fsblock atomic writes.

IMO, it is OK to keep this documentation as is for v6.16, and then work
on a more general doc which can cover details like:
- block device driver support (SCSI & NVMe)
- block layer support (bio split & merge)
- filesystem & iomap support (iomap, ext4, xfs)
- VFS layer support (statx, pwritev2...)

We can add these documents to their respective subsystem directories and
add a common document where the VFS details are kept, which will refer
to the subsystem-specific ones.
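
As an example of what the VFS part of such a doc could show, below is a
rough userspace pre-check for a single-buffer RWF_ATOMIC direct write.
It is only a sketch of the rules as I understand them from
generic_atomic_write_valid() (power-of-two length, offset aligned to
that length) plus the min/max range reported via statx(); it is not the
kernel code, and the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical userspace pre-check for one atomic write.
 * unit_min and unit_max are assumed to be the values statx() returned
 * in stx_atomic_write_unit_min and stx_atomic_write_unit_max.
 */
static bool atomic_write_ok(uint64_t offset, uint64_t len,
			    uint32_t unit_min, uint32_t unit_max)
{
	if (unit_min == 0 || unit_max == 0)
		return false;		/* atomic writes not supported */
	if (len < unit_min || len > unit_max)
		return false;		/* outside the advertised range */
	if (!is_pow2(len))
		return false;		/* length must be a power of two */
	return (offset & (len - 1)) == 0;	/* offset aligned to len */
}
```

The kernel remains the authority here: a write that passes this check
can still be rejected, for example if the file was not opened with
O_DIRECT.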

Thoughts?

-ritesh
