Message-ID: <20230503221749.GF3223426@dread.disaster.area>
Date: Thu, 4 May 2023 08:17:49 +1000
From: Dave Chinner <david@...morbit.com>
To: John Garry <john.g.garry@...cle.com>
Cc: axboe@...nel.dk, kbusch@...nel.org, hch@....de, sagi@...mberg.me,
martin.petersen@...cle.com, djwong@...nel.org,
viro@...iv.linux.org.uk, brauner@...nel.org, dchinner@...hat.com,
jejb@...ux.ibm.com, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-scsi@...r.kernel.org, linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org,
linux-security-module@...r.kernel.org, paul@...l-moore.com,
jmorris@...ei.org, serge@...lyn.com
Subject: Re: [PATCH RFC 03/16] xfs: Support atomic write for statx
On Wed, May 03, 2023 at 06:38:08PM +0000, John Garry wrote:
> Support providing info on atomic write unit min and max.
>
> Darrick Wong originally authored this change.
>
> Signed-off-by: John Garry <john.g.garry@...cle.com>
> ---
> fs/xfs/xfs_iops.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 24718adb3c16..e542077704aa 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -614,6 +614,16 @@ xfs_vn_getattr(
>  			stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
>  			stat->dio_offset_align = bdev_logical_block_size(bdev);
>  		}
> +		if (request_mask & STATX_WRITE_ATOMIC) {
> +			struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> +			struct block_device	*bdev = target->bt_bdev;
> +
> +			stat->atomic_write_unit_min = queue_atomic_write_unit_min(bdev->bd_queue);
> +			stat->atomic_write_unit_max = queue_atomic_write_unit_max(bdev->bd_queue);
I'm not sure this is right.
Given that we may have a 4kB physical sector device, XFS will not
allow IOs smaller than the physical sector size. The initial values of
queue_atomic_write_unit_min/max() will be (1 << SECTOR_SHIFT), which
is 512 bytes. IOs of that size will fail on 4kB sector size devices.
Further, XFS has a software sector size - it can define the sector
size for the filesystem to be 4kB on a 512 byte sector device. In
that case, the filesystem will reject 512 byte sized/aligned IOs
as they are smaller than the filesystem sector size (i.e. a config
that prevents sub-physical sector IO for 512 logical/4kB physical
devices).
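
To make the mismatch concrete, here is a minimal userspace sketch of the
check involved. It is only a model: model_dio_aligned() and
advertised_min_is_usable() are hypothetical names, not XFS or block
layer functions, and min_io_size stands for whichever is larger of the
physical sector size and the filesystem sector size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a direct IO size/alignment check: an IO is
 * accepted only if both its offset and its length are multiples of
 * the smallest IO size the filesystem will issue (a power of two).
 */
static bool model_dio_aligned(uint64_t offset, uint64_t len,
			      uint32_t min_io_size)
{
	assert(min_io_size != 0 && (min_io_size & (min_io_size - 1)) == 0);
	return len != 0 && ((offset | len) & (min_io_size - 1)) == 0;
}

/*
 * Would a write of exactly the advertised atomic_write_unit_min,
 * issued at offset 0, pass the filesystem's alignment check?
 */
static bool advertised_min_is_usable(uint32_t atomic_write_unit_min,
				     uint32_t min_io_size)
{
	return atomic_write_unit_min != 0 &&
	       model_dio_aligned(0, atomic_write_unit_min, min_io_size);
}
```

With a 512 byte advertised minimum and a 4kB minimum IO size,
advertised_min_is_usable(512, 4096) is false, which is the failure
described above. It only becomes true once the advertised minimum is
raised to at least 4096.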
There may be other filesystem constraints, too. Realtime devices have
fixed minimum allocation sizes which may be larger than atomic write
limits, which means that IO completion needs to split extents into
multiple unwritten/written extents. Extent size hints might be in
use, meaning we have different allocation alignment constraints to
atomic write constraints. Stripe alignment of extent allocation may
throw out atomic write alignment, etc.
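
As one illustration of how an allocation constraint could feed into the
reported limits, the sketch below caps the atomic write unit max so that
a naturally aligned atomic write never straddles an allocation unit
boundary. This policy is an assumption made for the example, not
something this thread settles on, and cap_unit_max() and
largest_pow2_divisor() are hypothetical helpers. alloc_unit stands for a
realtime extent size, extent size hint or stripe unit in bytes, and the
sketch assumes allocation units start at multiples of alloc_unit.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that divides v; v must be non-zero. */
static uint32_t largest_pow2_divisor(uint32_t v)
{
	assert(v != 0);
	return v & (~v + 1);	/* isolate the lowest set bit */
}

/*
 * Hypothetical policy: limit unit_max to the largest power of two that
 * divides the allocation unit. alloc_unit == 0 means "no constraint".
 * Returns 0 (atomic writes unsupported) if the capped maximum would
 * fall below unit_min.
 */
static uint32_t cap_unit_max(uint32_t unit_min, uint32_t unit_max,
			     uint32_t alloc_unit)
{
	uint32_t cap;

	if (alloc_unit == 0)
		return unit_max;

	cap = largest_pow2_divisor(alloc_unit);
	if (cap < unit_max)
		unit_max = cap;

	return unit_max >= unit_min ? unit_max : 0;
}
```

For example, a 12kB extent size hint (3 x 4kB) would cap a 64kB device
maximum at 4kB under this policy, because 4kB is the largest power of
two that divides 12kB.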
These are all solvable, but we need to make sure that the
filesystem constraints are taken into account here, not just the
block device limits.
As such, it is probably better to query these limits at filesystem
mount time and add them to the xfs buftarg (same as we do for
logical and physical sector sizes) and then use the xfs buftarg
values rather than having to go all the way to the device queue
here. That way we can ensure at mount time that atomic write limits
don't conflict with logical/physical IO limits, and we can further
constrain atomic limits during mount without always having to
recalculate those limits from first principles on every stat()
call...
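
A rough userspace sketch of that shape, under stated assumptions:
struct model_buftarg and struct model_stat are simplified stand-ins, not
the real struct xfs_buftarg or struct kstat, and the awu_min/awu_max
fields and both function names are invented for illustration. The
limits are validated once in the "mount" path, and the stat path only
copies the cached values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the buftarg; awu_* fields are invented. */
struct model_buftarg {
	uint32_t	physical_sectorsize;	/* bytes */
	uint32_t	awu_min;		/* cached at mount, 0 = none */
	uint32_t	awu_max;
};

/* Hypothetical stand-in for the atomic write fields reported by statx. */
struct model_stat {
	uint32_t	atomic_write_unit_min;
	uint32_t	atomic_write_unit_max;
};

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * "Mount time": take the device limits once, raise the minimum to the
 * smallest IO the filesystem will issue, and disable atomic writes if
 * the device limits cannot satisfy that.
 */
static void model_buftarg_init_atomic(struct model_buftarg *bt,
				      uint32_t dev_min, uint32_t dev_max,
				      uint32_t fs_sectorsize)
{
	uint32_t floor;

	assert(fs_sectorsize != 0);
	floor = max_u32(fs_sectorsize, bt->physical_sectorsize);

	bt->awu_min = 0;
	bt->awu_max = 0;
	if (dev_min == 0 || dev_max < dev_min || dev_max < floor)
		return;

	bt->awu_min = max_u32(dev_min, floor);
	bt->awu_max = dev_max;
}

/* "stat time": no recalculation, just report the cached limits. */
static void model_getattr_atomic(const struct model_buftarg *bt,
				 struct model_stat *stat)
{
	stat->atomic_write_unit_min = bt->awu_min;
	stat->atomic_write_unit_max = bt->awu_max;
}
```

Any further filesystem constraints would then be applied in the
mount-time function, leaving the stat path unchanged.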
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com