[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <70bc1a7f-1218-46d6-a8e8-96b667d393b7@oracle.com>
Date: Thu, 2 May 2024 10:12:36 +0100
From: John Garry <john.g.garry@...cle.com>
To: Dave Chinner <david@...morbit.com>
Cc: djwong@...nel.org, hch@....de, viro@...iv.linux.org.uk, brauner@...nel.org,
jack@...e.cz, chandan.babu@...cle.com, willy@...radead.org,
axboe@...nel.dk, martin.petersen@...cle.com,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tytso@....edu, jbongio@...gle.com, ojaswin@...ux.ibm.com,
ritesh.list@...il.com, mcgrof@...nel.org, p.raghav@...sung.com,
linux-xfs@...r.kernel.org, catherine.hoang@...cle.com
Subject: Re: [PATCH v3 17/21] iomap: Atomic write support
On 02/05/2024 02:43, Dave Chinner wrote:
> On Wed, May 01, 2024 at 12:08:34PM +0100, John Garry wrote:
>> On 01/05/2024 02:47, Dave Chinner wrote:
>>> On Mon, Apr 29, 2024 at 05:47:42PM +0000, John Garry wrote:
>>>> Support atomic writes by producing a single BIO with REQ_ATOMIC flag set.
>>>>
>>>> We rely on the FS to guarantee extent alignment, such that an atomic write
>>>> should never straddle two or more extents. The FS should also check for
>>>> validity of an atomic write length/alignment.
>>>>
>>>> Signed-off-by: John Garry <john.g.garry@...cle.com>
>>>> ---
> ...
>>>> +
>>>> bio->bi_private = dio;
>>>> bio->bi_end_io = iomap_dio_bio_end_io;
>>>> @@ -403,6 +407,12 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>>>> }
>>>> n = bio->bi_iter.bi_size;
>>>> + if (is_atomic && n != orig_count) {
>>>> + /* This bio should have covered the complete length */
>>>> + ret = -EINVAL;
>>>> + bio_put(bio);
>>>> + goto out;
>>>> + }
>>>
>>> What happens now if we've done zeroing IO before this? I suspect we
>>> might expose stale data if the partial block zeroing converts the
>>> unwritten extent in full...
>>
>> We use iomap_dio.ref to ensure that __iomap_dio_rw() does not return until
>> any zeroing and actual sub-io block write completes. See iomap_dio_zero() ->
>> iomap_dio_submit_bio() -> atomic_inc(&dio->ref) callchain. I meant to add
>> such info to the commit message, as you questioned this previously.
>
> Yes, I get that. But my point is that we may have only done -part-
> of a block unaligned IO.
>
> This is effectively a failure from a bio_iov_iter_get_pages() call.
> What does the bio_iov_iter_get_pages() failure case do that this new
> failure case not do? Why does this case have different failure
> handling?
>
So you are saying that if we fail here (that is the (is_atomic && n !=
orig_count) check), anything unwritten in the atomic write region and
zerotail region could expose stale data, right?
If so, I would say that we need to zero the complete unwritten atomic
write and zerotail regions - similar to the bio_iov_iter_get_pages()
failure handling - and still report an -EINVAL error.
How does that sound?
Thanks,
John
Powered by blists - more mailing lists