Message-ID: <7d9585df-9a1c-42f7-99ca-084dd47ea3ae@oracle.com>
Date: Mon, 17 Mar 2025 09:36:13 +0000
From: John Garry <john.g.garry@...cle.com>
To: Christoph Hellwig <hch@....de>
Cc: brauner@...nel.org, djwong@...nel.org, cem@...nel.org, dchinner@...hat.com,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
ritesh.list@...il.com, martin.petersen@...cle.com, tytso@....edu,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 11/13] xfs: add xfs_file_dio_write_atomic()
On 17/03/2025 06:41, Christoph Hellwig wrote:
> On Thu, Mar 13, 2025 at 05:13:08PM +0000, John Garry wrote:
>> + * REQ_ATOMIC-based is the preferred method, and is attempted first. If this
>> + * method fails due to REQ_ATOMIC-related constraints, then we retry with the
>> + * COW-based method. The REQ_ATOMIC-based method typically will fail if the
>> + * write spans multiple extents or the disk blocks are misaligned.
>
> It is only preferred if actually supported by the underlying hardware.
> If it isn't it really shouldn't even be tried, as that is just a waste
> of cycles.
We should not even call this function if atomics are not supported by the
HW - please see the IOCB_ATOMIC checks in xfs_file_write_iter(). So maybe
I will mention that the caller must ensure atomics are supported for the
write size.
>
> Also a lot of comment should probably be near the code not on top
> of the function as that's where people would look for them.
sure, if you prefer
>
>> +static noinline ssize_t
>> +xfs_file_dio_write_atomic(
>> + struct xfs_inode *ip,
>> + struct kiocb *iocb,
>> + struct iov_iter *from)
>> +{
>> + unsigned int iolock = XFS_IOLOCK_SHARED;
>> + unsigned int dio_flags = 0;
>> + const struct iomap_ops *dops = &xfs_direct_write_iomap_ops;
>> + ssize_t ret;
>> +
>> +retry:
>> + ret = xfs_ilock_iocb_for_write(iocb, &iolock);
>> + if (ret)
>> + return ret;
>> +
>> + ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
>> + if (ret)
>> + goto out_unlock;
>> +
>> + if (dio_flags & IOMAP_DIO_FORCE_WAIT)
>> + inode_dio_wait(VFS_I(ip));
>> +
>> + trace_xfs_file_direct_write(iocb, from);
>> + ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
>> + dio_flags, NULL, 0);
>
> The normal direct I/O path downgrades the iolock to shared before
> doing the I/O here. Why isn't that done here?
OK, I can do that. But we still always require the exclusive lock for the
CoW-based method.
>
>> + if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
>> + dops == &xfs_direct_write_iomap_ops) {
>
> This should probably explain the unusual use of EGAIN. Although I
> still feel that picking a different error code for the fallback would
> be much more maintainable.
I could try another error code - can you suggest one? Is it going to be
something unrelated to the storage stack, like EREMOTEIO?
>
>> + xfs_iunlock(ip, iolock);
>> + dio_flags = IOMAP_DIO_FORCE_WAIT;
>
> I notice the top of function comment mentions the IOMAP_DIO_FORCE_WAIT
> flag. Maybe use the chance to write a full sentence here or where
> it is checked to explain the logic a bit better?
ok, fine
>
>> * Handle block unaligned direct I/O writes
>> *
>> @@ -840,6 +909,10 @@ xfs_file_dio_write(
>> return xfs_file_dio_write_unaligned(ip, iocb, from);
>> if (xfs_is_zoned_inode(ip))
>> return xfs_file_dio_write_zoned(ip, iocb, from);
>> +
>> + if (iocb->ki_flags & IOCB_ATOMIC)
>> + return xfs_file_dio_write_atomic(ip, iocb, from);
>> +
>
> Either keep a space between all the conditional calls or none. If in
> doubt, just stick to the existing style.
Sure
>