[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53C74316.1050902@cn.fujitsu.com>
Date: Thu, 17 Jul 2014 11:29:26 +0800
From: Xiaoguang Wang <wangxg.fnst@...fujitsu.com>
To: "Darrick J. Wong" <darrick.wong@...cle.com>
CC: <linux-ext4@...r.kernel.org>, "Theodore Ts'o" <tytso@....edu>,
<a-fujita@...jp.nec.com>, <t-sato@...jp.nec.com>,
<sandeen@...hat.com>
Subject: Re: Question about e2fsprogs/e4defrag
Hi,
On 07/17/2014 03:22 AM, Darrick J. Wong wrote:
> On Wed, Jul 16, 2014 at 12:16:37PM +0800, Xiaoguang Wang wrote:
>> Hi,
>>
>> When I run xfstests/tests/generic/018 for ext4 file system in RHEL7.0GA,
>> sometimes it fails and sometime it succeeds. After looking into this case, I
>> think it's not a kernel ext4 bug, it maybe an e4dfrag bug. I compiled the
>> newest e2fsprogs to have test, it seems this issue still exits, so I still
>> send a mail to this list to look for some help, thanks.
>>
>> The issue is that sometimes e4defrag does not defrag file correctly.
>> Steps to reproduce this issue:
>> 1. cd mntpoint
>> 2. rm -f lege
>> 3. for I in `seq 9 -1 0`;
>> do dd if=/dev/zero of=lege bs=4k count=1 conv=notrunc seek=$I oflag=sync &>/dev/null;
>> done;
>> 4. e4defrag -c -v lege
>>
>> Repeatedly execute the 2, 3, 4 steps until you get a file which have the similar extent layout like below:
>> ################################################################
>> [root@...alhost test_e2fsprogs]# e4defrag -c -v lege
>> <File>
>> [ext 1]: start 49365571: logical 0: len 1
>> [ext 2]: start 49365570: logical 1: len 1
>> [ext 3]: start 49365569: logical 2: len 1
>> [ext 4]: start 49365568: logical 3: len 1
>> [ext 5]: start 49365567: logical 4: len 1
>> [ext 6]: start 49365566: logical 5: len 1
>> [ext 7]: start 49365565: logical 6: len 1
>> [ext 8]: start 49365564: logical 7: len 1
>> [ext 9]: start 49365563: logical 8: len 1
>> [ext 10]: start 49365562: logical 9: len 1
>>
>> Total/best extents 10/1
>> Average size per extent 4 KB
>> Fragmentation score 98
>> [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>> This file (lege) needs defragmentation.
>> Done.
>> ################################################################
>> The physical blocks are continuous but reversed.
>>
>> If we call e4defrag against this file, the output would be:
>> ################################################################
>> [root@...alhost test_e2fsprogs]# /tmp/e4defrag -v lege
>> ext4 defragmentation for lege
>> [1/1]lege: 100% extents: 10 -> 10 [ OK ]
>> Success: [1/1]
>> [root@...alhost test_e2fsprogs]# /tmp/e4defrag -v -c lege
>> <File>
>> [ext 1]: start 49365571: logical 0: len 1
>> [ext 2]: start 49365570: logical 1: len 1
>> [ext 3]: start 49365569: logical 2: len 1
>> [ext 4]: start 49365568: logical 3: len 1
>> [ext 5]: start 49365567: logical 4: len 1
>> [ext 6]: start 49365566: logical 5: len 1
>> [ext 7]: start 49365565: logical 6: len 1
>> [ext 8]: start 49365564: logical 7: len 1
>> [ext 9]: start 49365563: logical 8: len 1
>> [ext 10]: start 49365562: logical 9: len 1
>>
>> Total/best extents 10/1
>> Average size per extent 4 KB
>> Fragmentation score 98
>> [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>> This file (lege) needs defragmentation.
>> Done.
>> ################################################################
>> According to my understanding, this file is not defraged correctly and should
>> be convert into one extent. Or because if the physical blocks are continuous
>> though reversed, we do not need to do defragment?
>
> Oh, I think we /do/ need to defragment. Granted, file readahead might paper
> over the symptoms, but since the user explicitly ran e4defrag we can try
> to do better.
Yeah, agree.
>
>> I have checked the e4defrag source code, whether to do real defragment
>> depends on some conditions, please
>> see this code(e4defrag.c).
>> --main
>> --file_defrag
>>
>> In file_defrag(), there is such a judgement:
>> "if (file_frags_start <= best || orig_physical_cnt <= donor_physical_cnt)", If this returns true, the e4defrag will
>> not call call_defrag() to do real defragment work.
>>
>> Here file_frags_start: number of file fragments before defrag
>> orig_physical_cnt: number of original file's continuous physical region
>> donor_physical_cnt: number of donor file's continuous physical region
>>
>> In this "lege" file, the orig_physical_cnt is 1, and donor_physical_cnt is also 1, so the "if" is satisfied and
>> call_defrag() won't be called.
>
> This is a curious corner case of e4defrag -- if you look in get_file_extents(),
> the list of extents is insertion-sorted by physical block, which means that
> get_physical_count() (stupidly) looks only for gaps in the runs of physical
> blocks. Therefore, e4defrag thinks that this "lege" file has one physical
> extent. Ignoring logical block ordering, this is true, but as you point out,
> this leaves the "file written backwards" case in a fragmented state. So let's
> not ignore the logical block ordering:
>
> What I think we really need to do here is make get_physical_count() smarter --
> if there's a gap either in the physical or logical offsets of extents, then we
> need to increment *_physical_cnt so that we later decide to defragment the
> file.
>
> (Please keep reading)
I checked the code again, you are right, thanks for your explanation
>
>> Here I'd like to know the comparison "orig_physical_cnt <=
>> donor_physical_cnt" is useful? According to my understanding, what should we
>> have comparison are number of extents or average extent size.
>>
>> When I have this change:
>> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
>> index a204793..cd95698 100644
>> --- a/misc/e4defrag.c
>> +++ b/misc/e4defrag.c
>> @@ -1598,8 +1598,7 @@ check_improvement:
>> extents_before_defrag += file_frags_start;
>> }
>>
>> - if (file_frags_start <= best ||
>> - orig_physical_cnt <= donor_physical_cnt) {
>> + if (file_frags_start <= best) {
>
> This is incorrect, since the point of the "orig_physical_cnt <=
> donor_physical_cnt" check is to ensure that we don't increase the fragmentation
> of a file by swapping it with pieces from a donor file whose contents are
> spread out over a larger number of runs of physical blocks.
Ah, I see. I hadn't realized that, thanks.
>
> (It does, however, force defragmentation for all files, so you get the results
> you wanted.)
>
> Please try the patch at the end of this message on for size. It fixes things
> on my test VM; does it fix yours?
Yeah, it works, thanks.
Would you send a new version patch to fix this issue, or should I do it?
Regards,
Xiaoguang Wang
>
> --D
>
>> printf("\033[79;0H\033[K[%u/%u]%s:\t%3d%%",
>> defraged_file_count, total_count, file, 100);
>> if (mode_flag & DETAIL)
>>
>> Then the "lege" file could be defraged correctly.
>> ##################################################################
>> [root@...alhost test_e2fsprogs]# /tmp/e4defrag -v lege
>> ext4 defragmentation for lege
>> [1/1]lege: 100% extents: 10 -> 1 [ OK ]
>> Success: [1/1]
>> [root@...alhost test_e2fsprogs]# /tmp/e4defrag -v -c lege
>> <File>
>> [ext 1]: start 49366583: logical 0: len 10
>>
>> Total/best extents 1/1
>> Average size per extent 40 KB
>> Fragmentation score 0
>> [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>> This file (lege) does not need defragmentation.
>> Done.
>> ##################################################################
>>
>> Any opinion or suggestions will be appreciated!
>> If I'm wrong, please correct me, thanks!
>>
>> Regards,
>> Xiaoguang Wang
>
> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
> index a204793..d0eac60 100644
> --- a/misc/e4defrag.c
> +++ b/misc/e4defrag.c
> @@ -888,7 +888,9 @@ static int get_physical_count(struct fiemap_extent_list *physical_list_head)
>
> do {
> if ((ext_list_tmp->data.physical + ext_list_tmp->data.len)
> - != ext_list_tmp->next->data.physical) {
> + != ext_list_tmp->next->data.physical ||
> + (ext_list_tmp->data.logical + ext_list_tmp->data.len)
> + != ext_list_tmp->next->data.logical) {
> /* This extent and next extent are not continuous. */
> ret++;
> }
> .
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists