Message-ID: <20140130235044.31064.38113.stgit@birch.djwong.org>
Date:	Thu, 30 Jan 2014 15:50:45 -0800
From:	"Darrick J. Wong" <darrick.wong@...cle.com>
To:	tytso@....edu, darrick.wong@...cle.com
Cc:	linux-ext4@...r.kernel.org
Subject: [INSANE RFC PATCH 0/2] e2fsck metadata prefetch

This is a patchset that tries to reduce e2fsck run times by pre-loading
ext4 metadata concurrently with e2fsck execution.  The first patch
implements an mmap-based IO manager that mmaps the underlying device
and uses a simple memcpy to read and write data.  The second patch
extends libext2fs and e2fsck with a prefetch facility.  If the mmap
IO manager is active, the prefetcher spawns a bunch of threads
(_NPROCESSORS_ONLN by default) that scan semi-sequentially across the
disk, trying to fault in pages before the main e2fsck thread needs the
data.
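
To make the first patch concrete, here is a minimal sketch of what the
mmap IO manager boils down to.  The struct and function names
(mmap_io_open, mmap_io_read_blk, and so on) are illustrative only, not
the actual patch code, and most error handling is elided:

/*
 * Minimal sketch of the mmap IO manager idea: map the whole device
 * read/write and satisfy block reads and writes with memcpy.  The
 * structure and names here are illustrative, not the patch code.
 */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/fs.h>

struct mmap_io {
	int		fd;
	uint8_t		*map;		/* whole device mapped into memory */
	uint64_t	size;		/* device size in bytes */
	unsigned int	blocksize;
};

static int mmap_io_open(struct mmap_io *io, const char *dev,
			unsigned int blocksize)
{
	io->fd = open(dev, O_RDWR);
	if (io->fd < 0)
		return -1;
	if (ioctl(io->fd, BLKGETSIZE64, &io->size) < 0)
		return -1;
	io->map = mmap(NULL, io->size, PROT_READ | PROT_WRITE,
		       MAP_SHARED, io->fd, 0);
	if (io->map == MAP_FAILED)
		return -1;
	io->blocksize = blocksize;
	return 0;
}

/* Reads and writes become plain memcpy against the mapping. */
static void mmap_io_read_blk(struct mmap_io *io, uint64_t blk,
			     void *buf, int count)
{
	memcpy(buf, io->map + blk * io->blocksize,
	       (size_t)count * io->blocksize);
}

static void mmap_io_write_blk(struct mmap_io *io, uint64_t blk,
			      const void *buf, int count)
{
	memcpy(io->map + blk * io->blocksize, buf,
	       (size_t)count * io->blocksize);
}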

(If the unix IO manager is active, it settles for forking and using
the regular read calls to pull the metadata into the page cache.  My
efforts have concentrated almost entirely on the threaded mmap
prefetch.)
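
The fork-and-read fallback amounts to something along these lines;
prefetch_blocks and its arguments are made up for illustration, not
code from the patch -- the reads exist only to warm the page cache:

/*
 * Sketch of the non-mmap fallback: fork a child that issues ordinary
 * read() calls on the metadata blocks so that they land in the page
 * cache before the checker asks for them.
 */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

static void prefetch_blocks(int fd, unsigned int blocksize,
			    const uint64_t *blks, size_t nblks)
{
	char *buf;
	pid_t pid = fork();

	if (pid != 0)
		return;			/* parent (or failed fork): carry on */

	buf = malloc(blocksize);
	if (buf) {
		for (size_t i = 0; i < nblks; i++) {
			/* The data is thrown away; the read exists only
			 * to populate the page cache. */
			if (pread(fd, buf, blocksize,
				  (off_t)(blks[i] * blocksize)) < 0)
				break;
		}
		free(buf);
	}
	_exit(0);
}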

Each prefetch thread T, of N threads total, reads the directory
blocks, extent tree blocks, and inodes of groups T + (N * i) for
i = 0, 1, 2, ...; the hope is that this keeps the IO queues saturated
with requests for fairly close-by data.  Obviously, the success of
this scheme also depends on having enough free memory that things
stick around long enough for e2fsck to visit them.  MADV_WILLNEED
might help; I haven't tried it yet.
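
Spelled out, thread T walks groups T, T + N, T + 2N, and so on.  A
rough sketch of that striping, with the MADV_WILLNEED idea folded in,
might look like the following; struct prefetch_ctx, touch_block, and
group_first_block are hypothetical names, and the block lookup is a
stand-in for walking the group descriptors:

/*
 * Sketch of the per-thread group striping: thread T of N walks groups
 * T, T+N, T+2N, ... and touches the metadata of each one, with the
 * (so far untried) MADV_WILLNEED hint thrown in.  All of the names
 * below are illustrative, not the actual patch code.
 */
#include <stdint.h>
#include <sys/mman.h>

struct prefetch_ctx {
	uint8_t		*map;		/* mapping from the mmap IO manager */
	unsigned int	blocksize;	/* assumed multiple of the page size */
	uint64_t	ngroups;
	unsigned int	nthreads;	/* N, _NPROCESSORS_ONLN by default */
};

/* Hypothetical placeholder; the real code would walk the group
 * descriptors to find each group's inode table, directory blocks,
 * and extent tree blocks. */
static uint64_t group_first_block(struct prefetch_ctx *ctx, uint64_t group)
{
	(void)ctx;
	return group * 32768;		/* default ext4 blocks per group */
}

/* Hint at and then fault in one block's worth of pages. */
static void touch_block(struct prefetch_ctx *ctx, uint64_t blk)
{
	uint8_t *p = ctx->map + blk * ctx->blocksize;
	volatile uint8_t sink;

	madvise(p, ctx->blocksize, MADV_WILLNEED);
	sink = *p;			/* fall back to a plain load */
	(void)sink;
}

/* Body of prefetch thread T (thread_id == T, ctx->nthreads == N). */
static void prefetch_thread(struct prefetch_ctx *ctx, unsigned int thread_id)
{
	for (uint64_t g = thread_id; g < ctx->ngroups; g += ctx->nthreads)
		touch_block(ctx, group_first_block(ctx, g));
}

Striding by thread index keeps the N threads interleaved over adjacent
groups, so the outstanding requests stay reasonably close together on
disk.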

Crude testing has been done via:
# echo 3 > /proc/sys/vm/drop_caches
# PREFETCH=1 TEST_MMAP_IO=1 /usr/bin/time ./e2fsck/e2fsck -Fnfvtt /dev/XXX

So far in my crude testing on a cold system, I've seen about a 15-20%
speedup on a SSD, a 10-15% speedup on a 3x RAID1 SATA array, and maybe
a 5% speedup on a single-spindle SATA disk.  On a single-queue USB
HDD, performance regresses some 200% as the disk thrashes itself
towards an early grave.  It looks as though, in general, single-spindle
HDDs will suffer this effect, which doesn't surprise me.  I've not had
time to investigate whether running a single prefetch thread yields any
advantage.

On a warm system the speedups are much more modest -- 5% or less in
all cases (except the USB HDD, which still sucks).

There's also the minor problem that e2fsck will crash in malloc as
soon as it tries to make any changes to the disk.  So far this means
that we're limited to quick preening, but I'll work on fixing this.

I've tested these e2fsprogs changes against the -next branch as of
1/16.  These days, I use an 8GB ramdisk and a 20T "disk" I constructed
out of dm-snapshot to test in an x64 VM.  The make check tests should
pass.

Comments and questions are, as always, welcome.

--D