Message-ID: <08992e02-9ff4-416e-bd6c-e3e016356200@oracle.com>
Date: Tue, 18 Mar 2025 08:42:36 +0000
From: John Garry <john.g.garry@...cle.com>
To: Christoph Hellwig <hch@....de>
Cc: brauner@...nel.org, djwong@...nel.org, cem@...nel.org, dchinner@...hat.com,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
ritesh.list@...il.com, martin.petersen@...cle.com, tytso@....edu,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 11/13] xfs: add xfs_file_dio_write_atomic()
On 18/03/2025 05:43, Christoph Hellwig wrote:
> On Mon, Mar 17, 2025 at 09:36:13AM +0000, John Garry wrote:
>>> It is only preferred if actually supported by the underlying hardware.
>>> If it isn't it really shouldn't even be tried, as that is just a waste
>>> of cycles.
>>
>> We should not even call this function if atomics are not supported by HW -
>> please see IOCB_ATOMIC checks in xfs_file_write_iter(). So maybe I will
>> mention that the caller must ensure atomics are supported for the write
>> size.
>
> I see that this is what's done in the current series now. But that feels
> very wrong. Why do you want to deprive the user of this nice and useful
> code if they don't have the right hardware?
I don't think it's fair to say that we deprive the user - so far we just
don't support it, and nobody has asked for atomics without HW support.
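To make the current gating concrete, here is a minimal standalone sketch
(userspace C, with an illustrative helper name and made-up limit values,
not the kernel code itself) of the validation an IOCB_ATOMIC write has to
pass before the atomic write path is entered: a power-of-two length within
the advertised unit min/max, naturally aligned to its own size.

/*
 * Standalone illustration (not kernel code) of the pre-checks for an
 * atomic write: length must be a power of two within [unit_min, unit_max]
 * and the offset must be naturally aligned to the length.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool atomic_write_ok(uint64_t pos, uint64_t len,
			    uint64_t unit_min, uint64_t unit_max)
{
	if (len < unit_min || len > unit_max)
		return false;
	if (len & (len - 1))		/* must be a power of two */
		return false;
	if (pos & (len - 1))		/* must be naturally aligned */
		return false;
	return true;
}

int main(void)
{
	/* Example limits a device might advertise: 4K..64K. */
	printf("%d\n", atomic_write_ok(65536, 16384, 4096, 65536)); /* 1 */
	printf("%d\n", atomic_write_ok(4096, 12288, 4096, 65536));  /* 0: not pow2 */
	return 0;
}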
> Why do we limit us to the
> hardware supported size when we support more in software?
As I see it, HW offload gives fast and predictable performance.
The COW method is just a (slow) fallback for when HW offload is not possible.
If we want to allow the user to avail of atomic writes larger than the
mounted bdev's atomic write limit, then we should have a method to tell
the user the optimised threshold. They could read the bdev atomic limits
and infer this, but that is not a good user experience.
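For what it's worth, userspace can already read the per-file atomic write
limits via statx() (STATX_WRITE_ATOMIC and the stx_atomic_write_unit_min/max
fields on recent kernels); advertising an "optimised" threshold like the one
above would presumably need an additional field. A rough sketch, which needs
uapi headers new enough to carry those fields:

/* Query the advertised atomic write limits for a file via statx().
 * STATX_WRITE_ATOMIC and the stx_atomic_write_unit_* fields require a
 * recent kernel and recent uapi headers; older headers won't build this. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/stat.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct statx stx = { 0 };

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	if (syscall(SYS_statx, AT_FDCWD, argv[1], 0,
		    STATX_WRITE_ATOMIC, &stx) != 0) {
		perror("statx");
		return 1;
	}

	if (!(stx.stx_mask & STATX_WRITE_ATOMIC)) {
		printf("no atomic write support reported\n");
		return 0;
	}

	printf("atomic write unit min: %u\n", stx.stx_atomic_write_unit_min);
	printf("atomic write unit max: %u\n", stx.stx_atomic_write_unit_max);
	return 0;
}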
> How do you
> force test the software code if you require the hardware support?
>
>>>> + trace_xfs_file_direct_write(iocb, from);
>>>> + ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
>>>> + dio_flags, NULL, 0);
>>>
>>> The normal direct I/O path downgrades the iolock to shared before
>>> doing the I/O here. Why isn't that done here?
>>
>> OK, I can do that. But we still require exclusive lock always for the
>> CoW-based method.
>
> If you can do away with the lock that's great and useful to get good
> performance. But if not at least document why this is different from
> others. Similarly if the COW path needs an exclusive lock document why
> in the code.
ok, I'll do that.
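As a starting point, the split might look roughly like the sketch below -
not the series' actual code: demote the iolock to shared for the HW-offload
case (mirroring what the aligned DIO path achieves) and keep it exclusive
for the CoW-based fallback. xfs_atomic_write_cow_iomap_ops is a placeholder
name, and the usual write checks are elided.

/*
 * Sketch only: HW-offload atomic writes can run under a shared iolock,
 * the CoW-based fallback stays exclusive for the whole I/O.
 */
static noinline ssize_t
xfs_file_dio_write_atomic_sketch(
	struct xfs_inode	*ip,
	struct kiocb		*iocb,
	struct iov_iter		*from,
	bool			use_hw_atomic)
{
	unsigned int		iolock = XFS_IOLOCK_EXCL;
	ssize_t			ret;

	/* Write checks (xfs_file_write_checks()) elided for brevity. */
	xfs_ilock(ip, iolock);

	if (use_hw_atomic) {
		/*
		 * HW provides the atomicity guarantee, so the I/O can run
		 * concurrently with other DIO writers: drop to shared.
		 */
		xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
		iolock = XFS_IOLOCK_SHARED;
		trace_xfs_file_direct_write(iocb, from);
		ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops,
				&xfs_dio_write_ops, 0, NULL, 0);
	} else {
		/*
		 * CoW-based fallback: committing the new extents must not
		 * race with other writers, so stay exclusive for the I/O.
		 */
		trace_xfs_file_direct_write(iocb, from);
		ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_cow_iomap_ops,
				&xfs_dio_write_ops, 0, NULL, 0);
	}

	xfs_iunlock(ip, iolock);
	return ret;
}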
>
>
>>
>>>
>>>> + if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
>>>> + dops == &xfs_direct_write_iomap_ops) {
>>>
>>> This should probably explain the unusual use of EGAIN. Although I
>>> still feel that picking a different error code for the fallback would
>>> be much more maintainable.
>>
>> I could try another error code - can you suggest one? Is it going to be
>> something unrelated to the storage stack, like EREMOTEIO?
>
> Yes, the funky networking codes tend to be good candidates. E.g.
> ENOPROTOOPT for something that sounds at least vaguely related.
ok
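Concretely, the dispatch might then look something like the sketch below
(again not the actual patch; xfs_atomic_write_cow_iomap_ops is a placeholder
name, and the locking changes are as in the earlier sketch). A dedicated
sentinel such as -ENOPROTOOPT avoids the extra IOCB_NOWAIT check, since
-EAGAIN keeps its usual "would block" meaning:

	trace_xfs_file_direct_write(iocb, from);
	ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops,
			&xfs_dio_write_ops, dio_flags, NULL, 0);

	/*
	 * The HW-offload iomap ops return -ENOPROTOOPT as an internal
	 * sentinel when this write cannot be issued as a single hardware
	 * atomic; retry once with the CoW-based ops so the error is never
	 * seen by userspace.
	 */
	if (ret == -ENOPROTOOPT)
		ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_cow_iomap_ops,
				&xfs_dio_write_ops, dio_flags, NULL, 0);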
>
>>>> +
>>>> + if (iocb->ki_flags & IOCB_ATOMIC)
>>>> + return xfs_file_dio_write_atomic(ip, iocb, from);
>>>> +
>>>
>>> Either keep space between all the conditional calls or none. I doubt
>>> just stick to the existing style.
>>
>> Sure
>
> FYI, that 'I doubt' should have been 'in doubt'. I was just so happy to
> finally get the mail out after a flakey connection on the train.
>
thanks