[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0e9dc6f8-df1b-48f3-a9e0-f5f5507d92c1@oracle.com>
Date: Mon, 16 Sep 2024 10:44:38 +0100
From: John Garry <john.g.garry@...cle.com>
To: Dave Chinner <david@...morbit.com>
Cc: Ritesh Harjani <ritesh.list@...il.com>, chandan.babu@...cle.com,
djwong@...nel.org, dchinner@...hat.com, hch@....de,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, catherine.hoang@...cle.com,
martin.petersen@...cle.com
Subject: Re: [PATCH v4 00/14] forcealign for xfs
>> * I guess that you had not been following the recent discussion on this
>> topic in the latest xfs atomic writes series @ https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/20240817094800.776408-1-john.g.garry@oracle.com/__;!!ACWV5N9M2RV99hQ!JIzCbkyp3JuPyzBx1n80WAdog5rLxMRB65FYrs1sFf3ei-wOdqrU_DZBE5zwrJXhrj949HSE0TwOEV0ciu8$
>> and also mentioned earlier in
>> https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/20240726171358.GA27612@lst.de/__;!!ACWV5N9M2RV99hQ!JIzCbkyp3JuPyzBx1n80WAdog5rLxMRB65FYrs1sFf3ei-wOdqrU_DZBE5zwrJXhrj949HSE0TwOiiEnYSk$
>>
>> There I dropped the sub-alloc unit zeroing. The concept to iter for a single
>> bio seems sane, but as Darrick mentioned, we have issue of non-atomically
>> committing all the extent conversions.
>
> Yes, I understand these problems exist. My entire point is that the
> forced alignment implemention should never allow such unaligned
> extent patterns to be created in the first place. If we avoid
> creating such situations in the first place, then we never have to
> care about about unaligned unwritten extent conversion breaking
> atomic IO.
OK, but what about this situation with non-EOF unaligned extents:
# xfs_io -c "lsattr -v" mnt/file
[extsize, has-xattr, force-align] mnt/file
# xfs_io -c "extsize" mnt/file
[65536] mnt/file
#
# xfs_io -d -c "pwrite 64k 64k" mnt/file
# xfs_io -d -c "pwrite 8k 8k" mnt/file
# xfs_bmap -vvp mnt/file
mnt/file:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..15]: 384..399 0 (384..399) 16 010000
1: [16..31]: 400..415 0 (400..415) 16 000000
2: [32..127]: 416..511 0 (416..511) 96 010000
3: [128..255]: 256..383 0 (256..383) 128 000000
FLAG Values:
0010000 Unwritten preallocated extent
Here we have unaligned extents wrt extsize.
The sub-alloc unit zeroing would solve that - is that what you would
still advocate (to solve that issue)?
>
> FWIW, I also understand things are different if we are doing 128kB
> atomic writes on 16kB force aligned files. However, in this
> situation we are treating the 128kB atomic IO as eight individual
> 16kB atomic IOs that are physically contiguous.
Yes, if 16kB force aligned, userspace can only issue 16KB atomic writes.
> Hence in this
> situation it doesn't matter if we have a mix of 16kB aligned
> written/unwritten/hole extents as each 16kB chunks is independent of
> the others.
Sure
>
> What matters is that each indivudal 16kB chunk shows either the old
> data or the new data - we are not guaranteeing that the entire 128kB
> write is atomic. Hence in this situation we can both submit and
> process each 16kB shunk as independent IOs with independent IO
> compeltion transactions. All that matters is that we don't signal
> completion to userspace until all the IO is complete, and we already
> do that for fragmented DIO writes...
>
>>> Again, this is different to the traditional RT file behaviour - it
>>> can use unwritten extents for sub-alloc-unit alignment unmaps
>>> because the RT device can align file offset to any physical offset,
>>> and issue unaligned sector sized IO without any restrictions. Forced
>>> alignment does not have this freedom, and when we extend forced
>>> alignment to RT files, it will not have the freedom to use
>>> unwritten extents for sub-alloc-unit unmapping, either.
>>>
>> So how do you think that we should actually implement
>> xfs_itruncate_extents_flags() properly for forcealign? Would it simply be
>> like:
>>
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -1050,7 +1050,7 @@ xfs_itruncate_extents_flags(
>> WARN_ON_ONCE(first_unmap_block > XFS_MAX_FILEOFF);
>> return 0;
>> }
>> + if (xfs_inode_has_forcealign(ip))
>> + first_unmap_block = xfs_inode_roundup_alloc_unit(ip,
>> first_unmap_block);
>> error = xfs_bunmapi_range(&tp, ip, flags, first_unmap_block,
>
> Yes, it would be something like that, except it would have to be
> done before first_unmap_block is verified.
>
ok, and are you still of the opinion that this does not apply to rtvol?
Thanks,
John
Powered by blists - more mailing lists