[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <fa9aa138-476d-413f-ac02-35156baacd66@suse.com>
Date: Mon, 6 May 2024 15:29:22 +0930
From: Qu Wenruo <wqu@...e.com>
To: Qu Wenruo <quwenruo.btrfs@....com>, Anand Jain <anand.jain@...cle.com>,
linux-btrfs@...r.kernel.org
Cc: dsterba@...e.com, y16267966@...il.com, linux-ext4@...r.kernel.org
Subject: Re: [PATCH v2 4/4] btrfs-progs: convert: support ext2 unwritten file
data extents
在 2024/5/6 15:11, Qu Wenruo 写道:
>
>
> 在 2024/5/6 12:34, Anand Jain 写道:
>> This patch, along with the dependent patches below, adds support for
>> ext4 unmerged unwritten file extents as preallocated file extent in
>> btrfs.
>>
>> btrfs-progs: convert: refactor ext2_create_file_extents add argument
>> ext2_inode
>> btrfs-progs: convert: struct blk_iterate_data, add ext2-specific
>> file inode pointers
>> btrfs-progs: convert: refactor __btrfs_record_file_extent to add a
>> prealloc flag
>>
>> The patch is developed with POV of portability with the current
>> e2fsprogs library.
>>
>> This patch will handle independent unwritten extents by marking them
>> with prealloc
>> flag and will identify merged unwritten extents, triggering a fail.
>>
>> Testcase:
>>
>> $ dd if=/dev/urandom of=/mnt/test/foo bs=4K count=1 conv=fsync
>> status=none
>> $ dd if=/dev/urandom of=/mnt/test/foo bs=4K count=2 conv=fsync
>> seek=1 status=none
>> $ xfs_io -f -c 'falloc -k 12K 12K' /mnt/test/foo
>> $ dd if=/dev/zero of=/mnt/test/foo bs=4K count=1 conv=fsync
>> seek=6 status=none
>>
>> $ filefrag -v /mnt/test/foo
>> Filesystem type is: ef53
>> File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
>> ext: logical_offset: physical_offset: length:
>> expected: flags:
>> 0: 0.. 0: 33280.. 33280: 1:
>> 1: 1.. 2: 33792.. 33793: 2: 33281:
>> 2: 3.. 5: 33281.. 33283: 3:
>> 33794: unwritten
>> 3: 6.. 6: 33794.. 33794: 1:
>> 33284: last,eof
>>
>> $ sha256sum /mnt/test/foo
>>
>> 18619b678a5c207a971a0aa931604f48162e307c57ecdec450d5f095fe9f32c7
>> /mnt/test/foo
>>
>> Convert and compare the checksum
>>
>> Before:
>>
>> $ filefrag -v /mnt/test/foo
>> Filesystem type is: 9123683e
>> File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
>> ext: logical_offset: physical_offset: length:
>> expected: flags:
>> 0: 0.. 0: 33280.. 33280:
>> 1: shared
>> 1: 1.. 2: 33792.. 33793: 2:
>> 33281: shared
>> 2: 3.. 6: 33281.. 33284: 4:
>> 33794: last,shared,eof
>> /mnt/test/foo: 3 extents found
>>
>> $ sha256sum /mnt/test/foo
>>
>> 6874a1733e5785682210d69c07f256f684cf5433c7235ed29848b4a4d52030e0
>> /mnt/test/foo
>>
>> After:
>>
>> $ filefrag -v /mnt/test/foo
>> Filesystem type is: 9123683e
>> File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
>> ext: logical_offset: physical_offset: length:
>> expected: flags:
>> 0: 0.. 0: 33280.. 33280:
>> 1: shared
>> 1: 1.. 2: 33792.. 33793: 2:
>> 33281: shared
>> 2: 3.. 5: 33281.. 33283: 3:
>> 33794: unwritten,shared
>> 3: 6.. 6: 33794.. 33794: 1:
>> 33284: last,shared,eof
>> /mnt/test/foo: 4 extents found
>>
>> $ sha256sum /mnt/test/foo
>>
>> 18619b678a5c207a971a0aa931604f48162e307c57ecdec450d5f095fe9f32c7
>> /mnt/test/foo
>>
>> Signed-off-by: Anand Jain <anand.jain@...cle.com>
>> ---
>> v2:
>>
>> . Remove RFC
>> . Identify the block with a merged preallocated extent and call fail-safe
>> . Qu has an idea that it could be marked as a hole, which may be based on
>> top of this patch.
>
> Well, my idea of going holes other than preallocated extents is mostly
> to avoid the extra @prealloc flag parameter.
>
> But that's not a big deal for now, as I found the following way to
> easily crack your v2 patchset:
>
> # fallocate -l 1G test.img
> # mkfs.ext4 -F test.img
> # mount test.img $mnt
> # xfs_io -f -c "falloc 0 16K" $mnt/file
> # sync
> # xfs_io -f -c "pwrite 0 4k" -c "pwrite 12k 4k" $mnt/file
> # umount $mnt
> # ./btrfs-convert test.img
> btrfs-convert from btrfs-progs v6.8
>
> Source filesystem:
> Type: ext2
> Label:
> Blocksize: 4096
> UUID: 0f98aa2a-b1ee-4e91-8815-9b9a7b4af00a
> Target filesystem:
> Label:
> Blocksize: 4096
> Nodesize: 16384
> UUID: 3b8db399-8e25-495b-a41c-47afcb672020
> Checksum: crc32c
> Features: extref, skinny-metadata, no-holes, free-space-tree
> (default)
> Data csum: yes
> Inline data: yes
> Copy xattr: yes
> Reported stats:
> Total space: 1073741824
> Free space: 872349696 (81.24%)
> Inode count: 65536
> Free inodes: 65523
> Block count: 262144
> Create initial btrfs filesystem
> Create ext2 image file
> Create btrfs metadata
> ERROR: inode 13 index 0: identified unsupported merged block length 1
> wanted 4
> ERROR: failed to copy ext2 inode 13: -22
> ERROR: error during copy_inodes -22
> WARNING: error during conversion, the original filesystem is not modified
>
> [...]
>> +
>> + memset(&extent, 0, sizeof(struct ext2fs_extent));
>> + if (ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent)) {
>> + error("ext2fs_extent_get EXT2_EXTENT_CURRENT failed inode %d",
>> + src->ext2_ino);
>> + ext2fs_extent_free(handle);
>> + return -EINVAL;
>> + }
>> +
>> + if (extent.e_pblk != data->disk_block) {
>> + error("inode %d index %d found wrong extent e_pblk %llu wanted
>> disk_block %llu",
>> + src->ext2_ino, index, extent.e_pblk, data->disk_block);
>> + ext2fs_extent_free(handle);
>> + return -EINVAL;
>> + }
>> +
>> + if (extent.e_len != data->num_blocks) {
>> + error("inode %d index %d: identified unsupported merged block
>> length %u wanted %llu",
>> + src->ext2_ino, index, extent.e_len, data->num_blocks);
>> + ext2fs_extent_free(handle);
>> + return -EINVAL;
>> + }
>
> You have to split the extent in this case. As the example I gave, part
> of the extent can have been written.
> (And I'm not sure if the e_pblk check is also correct)
>
> I believe the example I gave could be a pretty good test case.
> (Or you can go one step further to interleave every 4K)
Furthermore, I have to consider what is the best way to iterate all data
extents of an ext2 inode.
Instead of ext2fs_block_iterate2(), I'm wondering if
ext2fs_extent_goto() would be a better solution. (As long as if it can
handle holes).
Another thing is, please Cc this series to ext4 mailing list if possible.
I hope to get some feedback from the ext4 exports as they may have a
much better idea than us.
Thanks,
Qu
>
> Thanks,
> Qu
>
>> +
>> + if (extent.e_flags & EXT2_EXTENT_FLAGS_UNINIT)
>> + *has_unwritten = true;
>> +
>> + return 0;
>> +}
>> +
>> static int ext2_dir_iterate_proc(ext2_ino_t dir, int entry,
>> struct ext2_dir_entry *dirent,
>> int offset, int blocksize,
>> diff --git a/convert/source-ext2.h b/convert/source-ext2.h
>> index 026a7cad8ac8..19014d3c25e6 100644
>> --- a/convert/source-ext2.h
>> +++ b/convert/source-ext2.h
>> @@ -82,6 +82,9 @@ struct ext2_source_fs {
>> ext2_ino_t ext2_ino;
>> };
>>
>> +int ext2_find_unwritten(struct blk_iterate_data *data, int index,
>> + bool *has_unwritten);
>> +
>> #define EXT2_ACL_VERSION 0x0001
>>
>> #endif /* BTRFSCONVERT_EXT2 */
>> diff --git a/convert/source-fs.c b/convert/source-fs.c
>> index df5ce66caf7f..88a6ceaf41f6 100644
>> --- a/convert/source-fs.c
>> +++ b/convert/source-fs.c
>> @@ -31,6 +31,7 @@
>> #include "common/extent-tree-utils.h"
>> #include "convert/common.h"
>> #include "convert/source-fs.h"
>> +#include "convert/source-ext2.h"
>>
>> const struct simple_range btrfs_reserved_ranges[3] = {
>> { 0, SZ_1M },
>> @@ -239,6 +240,15 @@ fail:
>> return ret;
>> }
>>
>> +int find_prealloc(struct blk_iterate_data *data, int index,
>> + bool *has_prealloc)
>> +{
>> + if (data->source_fs)
>> + return ext2_find_unwritten(data, index, has_prealloc);
>> +
>> + return -EINVAL;
>> +}
>> +
>> /*
>> * Record a file extent in original filesystem into btrfs one.
>> * The special point is, old disk_block can point to a reserved range.
>> @@ -257,6 +267,7 @@ int record_file_blocks(struct blk_iterate_data *data,
>> u64 old_disk_bytenr = disk_block * sectorsize;
>> u64 num_bytes = num_blocks * sectorsize;
>> u64 cur_off = old_disk_bytenr;
>> + int index = data->first_block;
>>
>> /* Hole, pass it to record_file_extent directly */
>> if (old_disk_bytenr == 0)
>> @@ -276,6 +287,16 @@ int record_file_blocks(struct blk_iterate_data
>> *data,
>> u64 extent_num_bytes;
>> u64 real_disk_bytenr;
>> u64 cur_len;
>> + u64 flags = BTRFS_FILE_EXTENT_REG;
>> + bool has_prealloc = false;
>> +
>> + if (find_prealloc(data, index, &has_prealloc)) {
>> + data->errcode = -1;
>> + return -EINVAL;
>> + }
>> +
>> + if (has_prealloc)
>> + flags = BTRFS_FILE_EXTENT_PREALLOC;
>>
>> key.objectid = data->convert_ino;
>> key.type = BTRFS_EXTENT_DATA_KEY;
>> @@ -316,12 +337,12 @@ int record_file_blocks(struct blk_iterate_data
>> *data,
>> old_disk_bytenr + num_bytes) - cur_off;
>> ret = btrfs_record_file_extent(data->trans, data->root,
>> data->objectid, data->inode, file_pos,
>> - real_disk_bytenr, cur_len,
>> - BTRFS_FILE_EXTENT_REG);
>> + real_disk_bytenr, cur_len, flags);
>> if (ret < 0)
>> break;
>> cur_off += cur_len;
>> file_pos += cur_len;
>> + index++;
>>
>> /*
>> * No need to care about csum
>> diff --git a/convert/source-fs.h b/convert/source-fs.h
>> index 25916c65681b..db7ead422585 100644
>> --- a/convert/source-fs.h
>> +++ b/convert/source-fs.h
>> @@ -153,5 +153,6 @@ int read_disk_extent(struct btrfs_root *root, u64
>> bytenr,
>> u32 num_bytes, char *buffer);
>> int record_file_blocks(struct blk_iterate_data *data,
>> u64 file_block, u64 disk_block, u64 num_blocks);
>> +int find_prealloc(struct blk_iterate_data *data, int index, bool
>> *has_prealloc);
>>
>> #endif
Powered by blists - more mailing lists