Message-ID: <ZcXNidyoaVJMFKYW@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com>
Date: Fri, 9 Feb 2024 12:30:56 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: John Garry <john.g.garry@...cle.com>
Cc: hch@....de, djwong@...nel.org, viro@...iv.linux.org.uk, brauner@...nel.org,
dchinner@...hat.com, jack@...e.cz, chandan.babu@...cle.com,
martin.petersen@...cle.com, linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tytso@....edu, jbongio@...gle.com
Subject: Re: [PATCH 4/6] fs: xfs: Support atomic write for statx
Hi John,
Thanks for the patch. I've added some review comments and questions
below.
On Wed, Jan 24, 2024 at 02:26:43PM +0000, John Garry wrote:
> Support providing info on atomic write unit min and max for an inode.
>
> For simplicity, currently we limit the min at the FS block size, but a
> lower limit could be supported in future.
>
> The atomic write unit min and max is limited by the guaranteed extent
> alignment for the inode.
>
> Signed-off-by: John Garry <john.g.garry@...cle.com>
> ---
> fs/xfs/xfs_iops.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> fs/xfs/xfs_iops.h | 4 ++++
> 2 files changed, 49 insertions(+)
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index a0d77f5f512e..0890d2f70f4d 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -546,6 +546,44 @@ xfs_stat_blksize(
> return PAGE_SIZE;
> }
>
> +void xfs_get_atomic_write_attr(
> + struct xfs_inode *ip,
> + unsigned int *unit_min,
> + unsigned int *unit_max)
> +{
> + xfs_extlen_t extsz = xfs_get_extsz(ip);
> + struct xfs_buftarg *target = xfs_inode_buftarg(ip);
> + struct block_device *bdev = target->bt_bdev;
> + unsigned int awu_min, awu_max, align;
> + struct request_queue *q = bdev->bd_queue;
> + struct xfs_mount *mp = ip->i_mount;
> +
> + /*
> + * Convert to multiples of the BLOCKSIZE (as we support a minimum
> + * atomic write unit of BLOCKSIZE).
> + */
> + awu_min = queue_atomic_write_unit_min_bytes(q);
> + awu_max = queue_atomic_write_unit_max_bytes(q);
> +
> + awu_min &= ~mp->m_blockmask;
> + awu_max &= ~mp->m_blockmask;
I don't understand why we round awu_max down to the block size here
instead of having an explicit (awu_max < blocksize) check.
I think the issue with changing awu_max is that we also use awu_max
to indirectly reflect the alignment, so as to ensure we don't cross the
atomic boundaries set by the hw (e.g. we check unit_max % atomic alignment
== 0 in scsi). So once we change awu_max, there's a chance that even
if an atomic write aligns to the new awu_max, it still doesn't have the
right alignment and fails.
It works right now since everything is a power of 2, but it would cause
issues in case we decide to remove that limitation. In any case, I think
this implicit behavior of things working because everything is a power of 2
should at least be documented in a comment, so that it is
immediately clear.
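To make the concern concrete, here is a small userspace sketch (not kernel
code). The helper names are hypothetical, and it assumes, purely for
illustration, that the device's atomic boundaries fall at multiples of the
original awu_max. It compares the masking used in the patch with an explicit
"smaller than a block" check:

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors "awu_max &= ~mp->m_blockmask": round v down to a multiple of
 * blocksize. blocksize must be a power of 2. (Hypothetical helper.) */
unsigned int round_down_to_block(unsigned int v, unsigned int blocksize)
{
	assert(blocksize && (blocksize & (blocksize - 1)) == 0);
	return v & ~(blocksize - 1);
}

/* Alternative raised in the review: leave the value untouched and only
 * reject it when it is smaller than one block. (Hypothetical helper.) */
unsigned int check_against_block(unsigned int v, unsigned int blocksize)
{
	return v < blocksize ? 0 : v;
}

/* True if [off, off + len) straddles a multiple of boundary.
 * ASSUMPTION: boundaries sit at multiples of the device's awu_max. */
bool crosses_boundary(unsigned int off, unsigned int len,
		      unsigned int boundary)
{
	return off / boundary != (off + len - 1) / boundary;
}
```

With power-of-2 values both helpers agree (64K stays 64K with 4K blocks, 2K
becomes 0). With a hypothetical non-power-of-2 awu_max of 12K and 8K blocks,
the mask yields 8K, and an 8K write at offset 8K is aligned to the new value
yet still straddles the 12K boundary.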
> +
> + align = XFS_FSB_TO_B(mp, extsz);
> +
> + if (!awu_max || !xfs_inode_atomicwrites(ip) || !align ||
> + !is_power_of_2(align)) {
Correct me if I'm wrong, but here as well, the is_power_of_2(align) check
is essentially checking whether align % unit_max == 0 (or vice versa if
unit_max is greater), so that an allocation of extsize will always align
nicely as needed by the device.
So maybe we should use the % expression explicitly so that the intention
is immediately clear.
> + *unit_min = 0;
> + *unit_max = 0;
> + } else {
> + if (awu_min)
> + *unit_min = min(awu_min, align);
How will the min() here work? If awu_min is the minimum set by the
device, how can statx be allowed to advertise something smaller than
that?
If I understand correctly, right now the way we set awu_min in scsi and
nvme, the following should usually be true for a sane device:
awu_min <= block size of fs <= align
so the min() becomes redundant anyway. But if we do assume that there
might be some weird devices with an absurdly large awu_min (SCSI with
high atomic granularity), we still can't actually advertise a min
smaller than that of the device, or am I missing something here?
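A small userspace sketch of the case in question follows. Both function names
are hypothetical; the first mirrors the logic as posted, the second is just
one possible way to never report less than the device minimum, not a
proposal for the final code:

```c
/* Mirrors the posted logic: min(awu_min, align) when the device reports a
 * minimum, otherwise the FS block size. */
unsigned int unit_min_as_posted(unsigned int awu_min, unsigned int align,
				unsigned int blocksize)
{
	if (awu_min)
		return awu_min < align ? awu_min : align;
	return blocksize;
}

/* One possible alternative (an assumption, not the patch): the floor is the
 * larger of the device minimum and the FS block size; if that floor does
 * not fit within align, report 0 (no atomic write support). */
unsigned int unit_min_never_below_device(unsigned int awu_min,
					 unsigned int align,
					 unsigned int blocksize)
{
	unsigned int floor = awu_min > blocksize ? awu_min : blocksize;

	return floor <= align ? floor : 0;
}
```

For a sane device (awu_min = 4K, blocksize = 4K, align = 16K) both return 4K.
For a hypothetical device with awu_min = 64K and align = 16K, the posted
logic reports 16K, which is below what the device can do, while the
alternative reports 0.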
> + else
> + *unit_min = mp->m_sb.sb_blocksize;
> +
> + *unit_max = min(awu_max, align);
> + }
> +}
> +
Regards,
ojaswin