lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250317064109.GA27621@lst.de>
Date: Mon, 17 Mar 2025 07:41:09 +0100
From: Christoph Hellwig <hch@....de>
To: John Garry <john.g.garry@...cle.com>
Cc: brauner@...nel.org, djwong@...nel.org, cem@...nel.org,
	dchinner@...hat.com, hch@....de, linux-xfs@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	ojaswin@...ux.ibm.com, ritesh.list@...il.com,
	martin.petersen@...cle.com, tytso@....edu,
	linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 11/13] xfs: add xfs_file_dio_write_atomic()

On Thu, Mar 13, 2025 at 05:13:08PM +0000, John Garry wrote:
> + * REQ_ATOMIC-based is the preferred method, and is attempted first. If this
> + * method fails due to REQ_ATOMIC-related constraints, then we retry with the
> + * COW-based method. The REQ_ATOMIC-based method typically will fail if the
> + * write spans multiple extents or the disk blocks are misaligned.

It is only preferred if actually supported by the underlying hardware.
If it isn't it really shouldn't even be tried, as that is just a waste
of cycles.

Also a lot of comment should probably be near the code not on top
of the function as that's where people would look for them.

> +static noinline ssize_t
> +xfs_file_dio_write_atomic(
> +	struct xfs_inode	*ip,
> +	struct kiocb		*iocb,
> +	struct iov_iter		*from)
> +{
> +	unsigned int		iolock = XFS_IOLOCK_SHARED;
> +	unsigned int		dio_flags = 0;
> +	const struct iomap_ops	*dops = &xfs_direct_write_iomap_ops;
> +	ssize_t			ret;
> +
> +retry:
> +	ret = xfs_ilock_iocb_for_write(iocb, &iolock);
> +	if (ret)
> +		return ret;
> +
> +	ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
> +	if (ret)
> +		goto out_unlock;
> +
> +	if (dio_flags & IOMAP_DIO_FORCE_WAIT)
> +		inode_dio_wait(VFS_I(ip));
> +
> +	trace_xfs_file_direct_write(iocb, from);
> +	ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
> +			dio_flags, NULL, 0);

The normal direct I/O path downgrades the iolock to shared before
doing the I/O here.  Why isn't that done here?

> +	if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
> +	    dops == &xfs_direct_write_iomap_ops) {

This should probably explain the unusual use of EGAIN.  Although I
still feel that picking a different error code for the fallback would
be much more maintainable.

> +		xfs_iunlock(ip, iolock);
> +		dio_flags = IOMAP_DIO_FORCE_WAIT;

I notice the top of function comment mentions the IOMAP_DIO_FORCE_WAIT
flag.  Maybe use the chance to write a full sentence here or where
it is checked to explain the logic a bit better?

>   * Handle block unaligned direct I/O writes
>   *
> @@ -840,6 +909,10 @@ xfs_file_dio_write(
>  		return xfs_file_dio_write_unaligned(ip, iocb, from);
>  	if (xfs_is_zoned_inode(ip))
>  		return xfs_file_dio_write_zoned(ip, iocb, from);
> +
> +	if (iocb->ki_flags & IOCB_ATOMIC)
> +		return xfs_file_dio_write_atomic(ip, iocb, from);
> +

Either keep space between all the conditional calls or none.  I doubt
just stick to the existing style.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ