Date:   Thu, 29 Apr 2021 14:47:36 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Mike Frysinger <vapier@...too.org>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: e4defrag seems too optimistic

On Apr 28, 2021, at 11:33 PM, Mike Frysinger <vapier@...too.org> wrote:
> 
> i started running e4defrag out of curiosity on some large files that i'm
> archiving long term.  its results seem exceedingly optimistic and i have
> a hard time agreeing with it.  am i pessimistic ?
> 
> for example, i have a ~4GB archive:
> $ e4defrag -c ./foo.tar.xz
> <File>                                         now/best       size/ext
> ./foo.tar.xz
>                                             39442/2             93 KB
> 
> Total/best extents				39442/2
> Average size per extent			93 KB
> Fragmentation score				34
> [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
> This file (./foo.tar.xz) does not need defragmentation.
> Done.
> 
> i have a real hard time seeing this file as barely "a little bit fragmented".
> shouldn't the fragmentation score be higher ?

I would tend to agree.  A 4GB file with 39k 100KB extents is not great.
On an HDD with 125 IOPS (not counting track buffers and such) this would
take about 300s to read at a whopping 13MB/s.  On flash, small writes do
lead to increased wear, but the seeks are free and you may not care.
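
Back-of-the-envelope only, plugging in the numbers from your report above
(125 seeks/sec assumed, 4GB taken as 4096MB):

$ echo "39442 / 125" | bc -l   # ~315 seconds spent just seeking
$ echo "4096 / 315" | bc -l    # ~13 MB/s effective read rate for the whole file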

IMHO, anything below 1MB/extent is sub-optimal in terms of IO performance,
and a sign of filesystem fragmentation (or a very poor IO pattern), since
mballoc should try to do allocation in 8MB chunks for large writes.

In many respects, if the extents are large enough, the "cost" of a seek is
hidden by the device bandwidth (e.g. 250 MB/s / 125 seeks/sec = 2MB for
a good HDD today, scale linearly for RAID-5/6), so any extent larger than
this is not limited by seeks. Should 1024 x 4MB extents in a 4GB file be
considered fragmented or not?  Definitely 108KB/extent should be.
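
(Same back-of-the-envelope arithmetic, purely illustrative:)

$ echo "250 / 125" | bc    # ~2MB of data per seek interval comes "for free"
$ echo "4096 / 2" | bc     # 2048 x 2MB extents for a 4GB file, yet not seek-limited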

However, the "ideal = 2" case is bogus, since extents are max size 128MB,
so you would need at least 32 for a perfect 4GB file.  In that respect,
e4defrag is at best a "working prototype", I don't think many people
use it, and it has not gotten many improvements since it first landed.
If you have a better idea for a "fragmentation score" I would be open
to looking at it, doubly so if it comes in the form of a patch.
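
As a strawman, the simplest proxy is just the average extent size against
the 1MB threshold above; a rough sketch only (assumes GNU stat and the
one-line filefrag summary output, path taken from your example):

$ SIZE_KB=$(( $(stat -c %s ./foo.tar.xz) / 1024 ))
$ EXTENTS=$(filefrag ./foo.tar.xz | awk '{ print $2 }')   # "./foo.tar.xz: 39442 extents found"
$ echo "$SIZE_KB / $EXTENTS" | bc    # average KB per extent; well under 1024 in your case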

You could check the actual file layout using "filefrag -v" before/after
running e4defrag to see how the allocation was changed.  This would tell
you if it is actually helping or not.  I've thought for a while that it
would be useful to add the same "fragmentation score" to filefrag, but
that would be contingent on the score actually making sense.
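
Something like the following (the output file names are just placeholders):

$ filefrag -v ./foo.tar.xz > layout.before
$ e4defrag ./foo.tar.xz
$ filefrag -v ./foo.tar.xz > layout.after
$ diff layout.before layout.after    # compare the extent count and per-extent sizes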

You can also use "e2freefrag" to check the filesystem as a whole to see
whether the free space is badly fragmented (i.e. most free chunks < 8MB).
In that case, running e4defrag _may_ help you, but it is not "smart" like
the old DOS defrag utilities, since it just rewrites each file separately
instead of having a "plan" for how to defrag the whole filesystem.
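
Something like (the device name is a placeholder; e2freefrag only reads,
it doesn't modify anything):

$ e2freefrag /dev/sdXN    # prints a histogram of free-extent sizes;
                          # mostly-below-8MB chunks means large new writes
                          # cannot avoid being split up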

> as a measure of "how fragmented is it really", if i copy the file and then
> delete the original, there's a noticeable delay before `rm` finishes.

Yes, that delay would be no surprise if you ran filefrag on the file first:
freeing ~39k extents means a lot more block-bitmap and extent-tree updates
than freeing a handful of large ones.

Cheers, Andreas
