[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200205124750.AE9DDA404D@d06av23.portsmouth.uk.ibm.com>
Date: Wed, 5 Feb 2020 18:17:44 +0530
From: Ritesh Harjani <riteshh@...ux.ibm.com>
To: "Darrick J. Wong" <darrick.wong@...cle.com>
Cc: jack@...e.cz, tytso@....edu, adilger.kernel@...ger.ca,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
hch@...radead.org, cmaiolino@...hat.com
Subject: Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
On 1/30/20 11:04 PM, Ritesh Harjani wrote:
>
>
> On 1/30/20 9:30 PM, Darrick J. Wong wrote:
>> On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
>>> Hello All,
>>>
>>> Background
>>> ==========
>>> There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap
>>> APIs.
>>> This reduces the users of ext4_get_block API and thus a step towards
>>> getting
>>> rid of buffer_heads from ext4. Also reduces a lot of code by making
>>> use of
>>> existing iomap_ops (except for xattr implementation).
>>>
>>> Testing (done on ext4 master branch)
>>> ========
>>> 'xfstests -g auto' passes with default mkfs/mount configuration
>>> (v/s which also pass with vanilla kernel without this patch). Except
>>> generic/473 which also failes on XFS. This seems to be the test case
>>> issue
>>> since it expects the data in slightly different way as compared to
>>> what iomap
>>> returns.
>>> Point 2.a below describes more about this.
>>>
>>> Observations/Review required
>>> ============================
>>> 1. bmap related old v/s new method differences:-
>>> a. In case if addr > INT_MAX, it issues a warning and
>>> returns 0 as the block no. While earlier it used to return the
>>> truncated value with no warning.
>>
>> Good...
>>
>>> b. block no. is only returned in case of iomap->type is
>>> IOMAP_MAPPED,
>>> but not when iomap->type is IOMAP_UNWRITTEN. While with
>>> previously
>>> we used to get block no. for both of above cases.
>>
>> Assuming the only remaining usecase of bmap is to tell old bootloaders
>> where to find vmlinuz blocks on disk, I don't see much reason to map
>> unwritten blocks -- there's no data there, and if your bootloader writes
>> to the filesystem(!) then you can't read whatever was written there
>> anyway.
>
> Yes, no objection there. Just wanted to get it reviewed.
>
>
>>
>> Uh, can we put this ioctl on the deprecation list, please? :)
>>
>>> 2. Fiemap related old v/s new method differences:-
>>> a. iomap_fiemap returns the disk extent information in exact
>>> correspondence with start of user requested logical offset
>>> till the
>>> length requested by user. While in previous implementation the
>>> returned information used to give the complete extent
>>> information if
>>> the range requested by user lies in between the extent mapping.
>>
>> This is a topic of much disagreement. The FIEMAP documentation says
>> that the call must return data for the requested range, but *may* return
>> a mapping that extends beyond the requested range.
>>
>> XFS (and now iomap) only return data for the requested range, whereas
>> ext4 has (had?) the behavior you describe. generic/473 was an attempt
>> to enforce the ext4 behavior across all filesystems, but I put it in my
>> dead list and never run it.
>>
>> So it's a behavioral change, but the new behavior isn't forbidden.
>
> Sure, thanks.
>
>>
>>> b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
>>> fiemap_extent mapping range requested by the user via fm_length (
>>> if that has a valid mapped extent on the disk).
>>
>> That sounds like a bug. _LAST is supposed to be set on the last extent
>> in the file, not the last record in the queried dataset.
>
> Thought so too, sure will spend some time to try fixing it.
Looked into this. I think below should fix our above reported problem
with current iomap code.
If no objection I will send send PATCHv3 with below fix as the first
patch in the series.
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index bccf305ea9ce..ee53991810d5 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
fiemap_extent_info *fi,
}
if (ctx.prev.type != IOMAP_HOLE) {
- ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
+ u32 flags = 0;
+ loff_t isize = i_size_read(inode);
+
+ if (ctx.prev.offset + ctx.prev.length >= isize)
+ flags |= FIEMAP_EXTENT_LAST;
+ ret = iomap_to_fiemap(fi, &ctx.prev, flags);
if (ret < 0)
return ret;
}
-ritesh
>
>
>>
>>> But if the user
>>> requested for more fm_length which could not be mapped in the
>>> last
>>> fiemap_extent, then the flag is not set.
>>
>> Yes... if there were more extents to map than there was space in the map
>> array, then _LAST should remain unset to encourage userspace to come
>> back for the rest of the mappings.
>>
>> (Unless maybe I'm misunderstanding here...)
>>
>>> e.g. output for above differences 2.a & 2.b
>>> ===========================================
>>> create a file with below cmds.
>>> 1. fallocate -o 0 -l 8K testfile.txt
>>> 2. xfs_io -c "pwrite 8K 8K" testfile.txt
>>> 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
>>> 4. run this binary on with and without these patches:- ./a.out
>>> (test_fiemap_diff.c) [4]
>>>
>>> o/p of xfs_io -c "fiemap -v"
>>> ============================================
>>> With this patch on patched kernel:-
>>> testfile.txt:
>>> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
>>> 0: [0..15]: 122802736..122802751 16 0x800
>>> 1: [16..31]: 122687536..122687551 16 0x1
>>>
>>> without patch on vanilla kernel (no difference):-
>>> testfile.txt:
>>> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
>>> 0: [0..15]: 332211376..332211391 16 0x800
>>> 1: [16..31]: 332722392..332722407 16 0x1
>>>
>>>
>>> o/p of a.out without patch:-
>>> ================
>>> riteshh-> ./a.out
>>> logical: [ 0.. 15] phys: 332211376..332211391 flags: 0x800
>>> tot: 16
>>> (0) extent flag = 2048
>>>
>>> o/p of a.out with patch (both point 2.a & 2.b could be seen)
>>> =======================
>>> riteshh-> ./a.out
>>> logical: [ 0.. 7] phys: 122802736..122802743 flags: 0x801
>>> tot: 8
>>> (0) extent flag = 2049
>>>
>>> FYI - In test_fiemap_diff.c test we had
>>> a. fm_extent_count = 1
>>> b. fm_start = 0
>>> c. fm_length = 4K
>>> Whereas when we change fm_extent_count = 32, then we don't see any
>>> difference.
>>>
>>> e.g. output for above difference listed in point 1.b
>>> ====================================================
>>>
>>> o/p without patch (block no returned for unwritten block as well)
>>> =========Testing IOCTL FIBMAP=========
>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>> 0 41526422
>>> 1 41526423
>>> 2 41590299
>>> 3 41590300
>>>
>>> o/p with patch (0 returned for unwritten block)
>>> =========Testing IOCTL FIBMAP=========
>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>> 0 0 0
>>> 1 0 0
>>> 2 15335942 29953
>>> 3 15335943 29953
>>>
>>>
>>> Summary:-
>>> ========
>>> Due to some of the observational differences to user, listed above,
>>> requesting to please help with a careful review in moving this to iomap.
>>> Digging into some older threads, it looks like these differences
>>> should be fine,
>>> since the same tools have been working fine with XFS (which uses
>>> iomap based
>>> implementation) [1]
>>> Also as Ted suggested in [3]: Fiemap & bmap spec could be made based
>>> on the ext4
>>> implementation. But since all the tools also work with xfs which uses
>>> iomap
>>> based fiemap, so we should be good there.
>>
>> <nod> Thanks for the worked example output. :)
>
> Thanks for the review. :)
>
> ritesh
>
>
>>
>> --D
>>
>>>
>>> References of some previous discussions:
>>> =======================================
>>> [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
>>> [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
>>> [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
>>> [4]:
>>> https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
>>>
>>> [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html
>>>
>>>
>>> Ritesh Harjani (4):
>>> ext4: Add IOMAP_F_MERGED for non-extent based mapping
>>> ext4: Optimize ext4_ext_precache for 0 depth
>>> ext4: Move ext4 bmap to use iomap infrastructure.
>>> ext4: Move ext4_fiemap to use iomap infrastructure
>>>
>>> fs/ext4/extents.c | 288 +++++++---------------------------------------
>>> fs/ext4/inline.c | 41 -------
>>> fs/ext4/inode.c | 6 +-
>>> 3 files changed, 49 insertions(+), 286 deletions(-)
>>>
>>> --
>>> 2.21.0
>>>
Powered by blists - more mailing lists