Message-ID: <08992e02-9ff4-416e-bd6c-e3e016356200@oracle.com>
Date: Tue, 18 Mar 2025 08:42:36 +0000
From: John Garry <john.g.garry@...cle.com>
To: Christoph Hellwig <hch@....de>
Cc: brauner@...nel.org, djwong@...nel.org, cem@...nel.org, dchinner@...hat.com,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
ritesh.list@...il.com, martin.petersen@...cle.com, tytso@....edu,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 11/13] xfs: add xfs_file_dio_write_atomic()
On 18/03/2025 05:43, Christoph Hellwig wrote:
> On Mon, Mar 17, 2025 at 09:36:13AM +0000, John Garry wrote:
>>> It is only preferred if actually supported by the underlying hardware.
>>> If it isn't it really shouldn't even be tried, as that is just a waste
>>> of cycles.
>>
>> We should not even call this function if atomics are not supported by HW -
>> please see IOCB_ATOMIC checks in xfs_file_write_iter(). So maybe I will
>> mention that the caller must ensure atomics are supported for the write
>> size.
>
> I see that this is what's done in the current series now. But that feels
> very wrong. Why do you want to deprive the user of this nice and useful
> code if they don't have the right hardware?
I don't think it's fair to say that we deprive the user - so far we just
don't support it, and nobody has asked for atomics without HW support.
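To make the current gating concrete, here is a minimal standalone sketch
(userspace C, with an illustrative helper name and made-up limit values,
not the kernel code itself) of the validation an IOCB_ATOMIC write has to
pass before the atomic write path is entered: a power-of-two length within
the advertised unit min/max, naturally aligned to its own size.

/*
 * Standalone illustration (not kernel code) of the pre-checks for an
 * atomic write: length must be a power of two within [unit_min, unit_max]
 * and the offset must be naturally aligned to the length.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool atomic_write_ok(uint64_t pos, uint64_t len,
			    uint64_t unit_min, uint64_t unit_max)
{
	if (len < unit_min || len > unit_max)
		return false;
	if (len & (len - 1))		/* must be a power of two */
		return false;
	if (pos & (len - 1))		/* must be naturally aligned */
		return false;
	return true;
}

int main(void)
{
	/* Example limits a device might advertise: 4K..64K. */
	printf("%d\n", atomic_write_ok(65536, 16384, 4096, 65536)); /* 1 */
	printf("%d\n", atomic_write_ok(4096, 12288, 4096, 65536));  /* 0: not pow2 */
	return 0;
}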
> Why do we limit us to the
> hardware supported size when we support more in software?
As I see it, HW offload gives fast and predictable performance.
The COW method is just a (slow) fallback for when HW offload is not possible.
If we want to allow the user to avail of atomic writes larger than the
mounted bdev's atomic write limit, then we should have a method to tell
the user the optimised threshold. They could read the bdev atomic limits
and infer this, but that is not a good user experience.
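For what it's worth, userspace can already read the per-file atomic write
limits via statx() (STATX_WRITE_ATOMIC and the stx_atomic_write_unit_min/max
fields on recent kernels); advertising an "optimised" threshold like the one
above would presumably need an additional field. A rough sketch, which needs
uapi headers new enough to carry those fields:

/* Query the advertised atomic write limits for a file via statx().
 * STATX_WRITE_ATOMIC and the stx_atomic_write_unit_* fields require a
 * recent kernel and recent uapi headers; older headers won't build this. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/stat.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct statx stx = { 0 };

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	if (syscall(SYS_statx, AT_FDCWD, argv[1], 0,
		    STATX_WRITE_ATOMIC, &stx) != 0) {
		perror("statx");
		return 1;
	}

	if (!(stx.stx_mask & STATX_WRITE_ATOMIC)) {
		printf("no atomic write support reported\n");
		return 0;
	}

	printf("atomic write unit min: %u\n", stx.stx_atomic_write_unit_min);
	printf("atomic write unit max: %u\n", stx.stx_atomic_write_unit_max);
	return 0;
}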
> How do you
> force test the software code if you require the hardware support?
>
>>>> + trace_xfs_file_direct_write(iocb, from);
>>>> + ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
>>>> + dio_flags, NULL, 0);
>>>
>>> The normal direct I/O path downgrades the iolock to shared before
>>> doing the I/O here. Why isn't that done here?
>>
>> OK, I can do that. But we still require exclusive lock always for the
>> CoW-based method.
>
> If you can do away with the lock that's great and useful to get good
> performance. But if not at least document why this is different from
> others. Similarly if the COW path needs an exclusive lock document why
> in the code.
ok, I'll do that.
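As a starting point, the split might look roughly like the sketch below -
not the series' actual code: demote the iolock to shared for the HW-offload
case (mirroring what the aligned DIO path achieves) and keep it exclusive
for the CoW-based fallback. xfs_atomic_write_cow_iomap_ops is a placeholder
name, and the usual write checks are elided.

/*
 * Sketch only: HW-offload atomic writes can run under a shared iolock,
 * the CoW-based fallback stays exclusive for the whole I/O.
 */
static noinline ssize_t
xfs_file_dio_write_atomic_sketch(
	struct xfs_inode	*ip,
	struct kiocb		*iocb,
	struct iov_iter		*from,
	bool			use_hw_atomic)
{
	unsigned int		iolock = XFS_IOLOCK_EXCL;
	ssize_t			ret;

	/* Write checks (xfs_file_write_checks()) elided for brevity. */
	xfs_ilock(ip, iolock);

	if (use_hw_atomic) {
		/*
		 * HW provides the atomicity guarantee, so the I/O can run
		 * concurrently with other DIO writers: drop to shared.
		 */
		xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
		iolock = XFS_IOLOCK_SHARED;
		trace_xfs_file_direct_write(iocb, from);
		ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops,
				&xfs_dio_write_ops, 0, NULL, 0);
	} else {
		/*
		 * CoW-based fallback: committing the new extents must not
		 * race with other writers, so stay exclusive for the I/O.
		 */
		trace_xfs_file_direct_write(iocb, from);
		ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_cow_iomap_ops,
				&xfs_dio_write_ops, 0, NULL, 0);
	}

	xfs_iunlock(ip, iolock);
	return ret;
}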
>
>
>>
>>>
>>>> + if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
>>>> + dops == &xfs_direct_write_iomap_ops) {
>>>
>>> This should probably explain the unusual use of EGAIN. Although I
>>> still feel that picking a different error code for the fallback would
>>> be much more maintainable.
>>
>> I could try another error code - can you suggest one? Is it going to be
>> something unrelated to the storage stack, like EREMOTEIO?
>
> Yes, the funky networking codes tend to be good candidates. E.g.
> ENOPROTOOPT for something that sounds at least vaguely related.
ok
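Concretely, the dispatch might then look something like the sketch below
(again not the actual patch; xfs_atomic_write_cow_iomap_ops is a placeholder
name, and the locking changes are as in the earlier sketch). A dedicated
sentinel such as -ENOPROTOOPT avoids the extra IOCB_NOWAIT check, since
-EAGAIN keeps its usual "would block" meaning:

	trace_xfs_file_direct_write(iocb, from);
	ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops,
			&xfs_dio_write_ops, dio_flags, NULL, 0);

	/*
	 * The HW-offload iomap ops return -ENOPROTOOPT as an internal
	 * sentinel when this write cannot be issued as a single hardware
	 * atomic; retry once with the CoW-based ops so the error is never
	 * seen by userspace.
	 */
	if (ret == -ENOPROTOOPT)
		ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_cow_iomap_ops,
				&xfs_dio_write_ops, dio_flags, NULL, 0);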
>
>>>> +
>>>> + if (iocb->ki_flags & IOCB_ATOMIC)
>>>> + return xfs_file_dio_write_atomic(ip, iocb, from);
>>>> +
>>>
>>> Either keep space between all the conditional calls or none. I doubt
>>> just stick to the existing style.
>>
>> Sure
>
> FYI, that 'I doubt' should have been 'in doubt'. I was just so happy to
> finally get the mail out after a flakey connection on the train.
>
thanks