Message-ID: <87pr9npdlc.fsf@meyering.net>
Date:	Sat, 19 Sep 2009 10:01:51 +0200
From:	Jim Meyering <jim@...ering.net>
To:	Theodore Tso <tytso@....edu>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: efficient access to "rotational";  new fcntl?

Theodore Tso wrote:
> On Fri, Sep 18, 2009 at 09:31:50PM +0200, Jim Meyering wrote:
>>     chgrp, chmod, chown, chcon, du, rm: now all display linear performance,
>>     even when operating on million-entry directories on ext3 and ext4 file
>>     systems.  Before, they would exhibit O(N^2) performance, due to linear
>>     per-entry seek time cost when operating on entries in readdir order.
>>     Rm was improved directly, while the others inherit the improvement
>>     from the newer version of fts in gnulib.
>
> Excellent!  I didn't know that (since my userspace is still Ubuntu
> 9.04, which is still using coreutils 6.10).

Heh.  Time to upgrade.
With the upcoming coreutils-7.7, I've removed a quadratic
component in rm -r (without -f), and rewritten it to give
rm -rf an additional 4-5x speed-up in some nasty cases.

>> However, with e.g., an ext4 partition on non-rotational hardware like
>> an SSD, that preprocessing is unnecessary and in fact wasted effort.
>> I'd like to avoid the waste by querying the equivalent of
>> /sys/.../rotational, via a syscall like fcntl or statvfs,
>> given a file descriptor.
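
For reference, the sysfs lookup mentioned above can be sketched in user space roughly as follows. This is only an illustration, not anything coreutils or fts does: the function name is_rotational and its sysfs_root parameter are made up for this example, and it assumes the usual Linux layout in which /sys/dev/block/MAJOR:MINOR points at the device directory and a partition's parent directory holds queue/rotational.

```python
import os

def is_rotational(path, sysfs_root="/sys"):
    """Return True/False for the block device backing `path`, or None if unknown.

    Hypothetical helper: `sysfs_root` is a parameter only so the lookup can be
    pointed at a different tree; on a real system it would stay at "/sys".
    """
    st = os.stat(path)
    dev_dir = os.path.realpath(
        os.path.join(sysfs_root, "dev", "block",
                     f"{os.major(st.st_dev)}:{os.minor(st.st_dev)}"))
    # A partition has no queue/ directory of its own; assume its parent
    # directory (the whole disk) carries the attribute instead.
    for candidate in (dev_dir, os.path.dirname(dev_dir)):
        try:
            with open(os.path.join(candidate, "queue", "rotational")) as f:
                return f.read().strip() == "1"
        except OSError:
            continue
    # tmpfs, network file systems, etc. have no matching sysfs entry.
    return None
```

Note how much path manipulation this takes compared with a single call on a file descriptor, which is the convenience the request above is after.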
>
> Have you benchmarked it both ways?  The preprocessing will cost some
> extra CPU time, sure, but for a sufficiently large directory, or if
> the user is deleting a very large directory hierarchy, such that "rm
> -rf" spans multiple journal transactions, deleting the files in inode
> order will still avoid some filesystem metadata blocks getting written
> multiple times (which for SSD's, especially the crappier ones with
> nasty write amplification factors) could show a performance impact.

Yeah, I mentioned on IRC yesterday that I should do exactly that.
I've just run some tests, and see that at least with one SSD (OCZ Summit
120GB), the 0.5s cost of sorting pays off handsomely with a 12x speed-up,
saving 5.5 minutes, when removing a 1-million-empty-file directory.

----------------------------------------
Timing rm -rf million-file-dir on ext4, on a 120GB OCZ Summit, on Fedora 11
This is using the very latest rm/remove.c from coreutils.git,
the one rewritten to use fts.

Creation took about 63 seconds:
    mkdir d;(cd d && seq 1000000|xargs touch)

Removal with inode-sort preprocessing (the 0.543s is sort duration):
  $ env time ./rm -rf d
  0.543050295
  1.62user 20.13system 0:28.25elapsed 77%CPU (0avgtext+0avgdata 0maxresident)k
  9968inputs+8outputs (0major+74445minor)pagefaults 0swaps

2nd trial: (create million-file dir)
  $ mkdir d;(cd d && seq 1000000|env time xargs touch)
  0.63user 62.14system 1:06.49elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
  40inputs+16outputs (1major+19701minor)pagefaults 0swaps

Remove it:
  $ env time ./rm -rf d
  0.570515343
  1.72user 18.49system 0:26.45elapsed 76%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+8outputs (0major+74445minor)pagefaults 0swaps
---------------------------------------------

Repeating, but with fts' sort-on-inode disabled:
  Ouch.  It would have taken about 6 minutes.
  I killed it after ~3 minutes, when it had removed half of the entries.
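
The speed-up and time-saved figures follow directly from the elapsed times above. A small sketch of the arithmetic, assuming the unsorted run really would have taken the extrapolated 6 minutes (it was killed halfway, so that number is an estimate, not a measurement):

```python
# Figures copied from the timings in this message.
SORTED_ELAPSED_S = [28.25, 26.45]   # the two trials with inode-sort enabled
UNSORTED_ESTIMATE_S = 6 * 60        # "about 6 minutes", extrapolated from a killed run

def speedup_and_savings(sorted_s, unsorted_s):
    """Return (speed-up factor, minutes saved) for one trial."""
    return unsorted_s / sorted_s, (unsorted_s - sorted_s) / 60

results = [speedup_and_savings(s, UNSORTED_ESTIMATE_S) for s in SORTED_ELAPSED_S]
for factor, saved in results:
    print(f"{factor:.1f}x faster, {saved:.1f} minutes saved")
```

Both trials come out above 12x, with a bit more than five and a half minutes saved each.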

Conclusion:
  Even on an SSD, this sort-on-inode preprocessing gives more
  than a 10x speed-up when removing a 1-million-empty-file directory.
  Hence, fts does not need access to the "rotational" bit, after all.
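
For readers who want to see the idea in isolation, here is a minimal sketch of the sort-on-inode preprocessing for a single flat directory. It is not the fts or remove.c code (those are C and handle whole hierarchies); the function name remove_entries_in_inode_order is made up for this example, and it assumes the inode numbers reported by the directory scan are what the file system uses to lay out its metadata.

```python
import os

def remove_entries_in_inode_order(dirpath):
    """Unlink every non-directory entry of `dirpath`, visiting them in inode order.

    Returns the number of entries removed.  Subdirectories are left alone.
    """
    # Read the whole directory first (readdir order), recording each inode number.
    with os.scandir(dirpath) as it:
        entries = [(e.inode(), e.name) for e in it
                   if not e.is_dir(follow_symlinks=False)]
    # This sort is the "preprocessing" step: about half a second for a
    # million entries in the measurements above.
    entries.sort()
    for _ino, name in entries:
        os.unlink(os.path.join(dirpath, name))
    return len(entries)
```

Calling remove_entries_in_inode_order("d") on a directory created as in the tests above would empty it, leaving "d" itself in place.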