lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BANLkTi=hYUsAejN7J_QEnvsA-jOfoTP0PA@mail.gmail.com>
Date:	Sun, 19 Jun 2011 15:55:47 -0400
From:	Greg Freemyer <greg.freemyer@...il.com>
To:	Kazuya Mio <k-mio@...jp.nec.com>
Cc:	ext4 <linux-ext4@...r.kernel.org>, Theodore Tso <tytso@....edu>
Subject: Re: [PATCH 01/11 RESEND] libe2p: Add new function get_fragment_score()

2011/6/15 Kazuya Mio <k-mio@...jp.nec.com>:
> This patch adds get_fragment_score() to libe2p. get_fragment_score() returns
> the fragmentation score. It shows the percentage of extents whose size is
> smaller than the input argument "threshold".
>
> However, there are some cases that cannot be merged into a next extent.
> The following extents are excepted from the calculation of fragmentation score:
> - The extent whose initialize status is different from the next extent
> - There is a hole between the extent and its next extent
> - The extent is a tail
>
> Signed-off-by: Kazuya Mio <k-mio@...jp.nec.com>
> ---

Kazuya,

If you're are going to stick with a percentage based approach I agree
with special casing the tail of the file or the tail of each
block_group in a sparse file.

But I think you should also special case the head extent of a block_group.

More clearly, whatever special treatment that is afforded to the
tail_extent of a block_group, should also be afforded to head_extent.

Maybe something like this for sparse files:

M_total=0
N_total=0
For (each block group) {
   M = extents in block group below threshold
   N = extents in block group above threshold

   if first_extent is below threshold  {  --M; }  # don't count it as
below threshold
   if last_extent is below threshold  {  --M; }  # don't count it as
below threshold

   M_total +=M
   N_totatl += N

   block_group_fragmentation_score = M / (N+M)
}
file_fragmentation_score = M_total / (N_total + M_total)

===> overall e4defrag design feedback

e4defrag makes the decision to defrag sparse files at the full file
level last I looked and I believe it still does.

I think it should treat each non-sparse file and each block_group of a
sparse file separately as a target for defragmentation.

The patch you are submitting just embeds the "file" defragmentation
methodology further into the code.  It very much seems to me that the
e4defrag code and scoring needs to work at the block_group size at the
largest.

That is for sparse files it should "score" a block_group at a time at
most.  And then it should compare the target's block_group layout vs.
the donor files block_group layout.

Once it is decided to defragment a block_group, then I think it should
fallocate the largest available extent (that may require a kernel
patch).

If that newly allocated extent is bigger than the first extent in the
block_group, then call into the kernel to swap out the target blocks
for the donor blocks.  Then move to the next target extent and repeat.

For a large file, making the overall defrag choice at the block_group
level just seems right, but for actually replacing target extents with
donor extents, I really think that needs to be decided one extent at a
time as I describe above.

After all, what is the value taking a block group with a 100 full
extents followed by two partial extents and moving all of that data
around in order to defrag it.  Only the last 2 extents need to be
consolidated.

Using the approach I described above, the code would quickly bypass
the first 100 full extents and then just defrag the last 2 partial
extents.

Greg
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ