Message-ID: <20090427021411.GA9059@mit.edu>
Date:	Sun, 26 Apr 2009 22:14:11 -0400
From:	Theodore Tso <tytso@....edu>
To:	Curt Wohlgemuth <curtw@...gle.com>
Cc:	Andreas Dilger <adilger@....com>,
	ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: Question on block group allocation

On Thu, Apr 23, 2009 at 03:02:05PM -0700, Curt Wohlgemuth wrote:
> > This is likely the "uninit_bg" feature that is causing the allocations
> > to skip groups which are marked BLOCK_UNINIT.  In some sense the benefit
> > of skipping the block bitmap read during e2fsck is probably not at all
> > beneficial compared to the cost of the extra seeking during IO.  As the
> > filesystem gets more full, the BLOCK_UNINIT flags would be cleared anyways,
> > so we might as well just keep the early allocations contiguous.

Well, I tried out Andreas' patch, by doing an rsync copy from my SSD
root partition to a 5400 rpm laptop drive, and then ran e2fsck and
dumpe2fs.  The results were interesting:

                 Before Patch                  After Patch
               Time in seconds               Time in seconds
           Real /  User /  Sys    MB/s    Real /  User /  Sys    MB/s
Pass 1     8.52 /  2.21 / 0.46   20.43    8.84 /  4.97 / 1.11   19.68
Pass 2    21.16 /  1.02 / 1.86   11.30    6.54 /  1.77 / 1.78   36.39
Pass 3     0.01 /  0.00 / 0.00  139.00    0.01 /  0.01 / 0.00  128.90
Pass 4     0.16 /  0.15 / 0.00    0.00    0.17 /  0.17 / 0.00    0.00
Pass 5     2.52 /  1.99 / 0.09    0.79    2.31 /  1.78 / 0.06    0.86
Total     32.40 /  5.11 / 2.49   12.81   17.99 /  8.75 / 2.98   23.01

The surprise is in the gross inspection of the dumpe2fs results:

                              Before Patch    After Patch
# of non-contig files              762            779
# of non-contig directories        571            570
# of BLOCK_UNINIT bg's             307            293
# of INODE_UNINIT bg's             503            503
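
For anyone who wants to reproduce the two UNINIT counts, here is a rough
sketch of how they could be tallied from saved dumpe2fs output.  It
assumes each group header line looks like "Group N: (Blocks a-b) [FLAG,
FLAG]", which can vary between e2fsprogs versions; the function name and
the sample text are made up for illustration and are not real output
from this test.

```python
import re

def count_uninit_groups(dumpe2fs_text):
    """Count block groups flagged BLOCK_UNINIT and INODE_UNINIT.

    Assumes group header lines of the form
    "Group 3: (Blocks 98304-131071) [INODE_UNINIT, BLOCK_UNINIT]".
    """
    counts = {"BLOCK_UNINIT": 0, "INODE_UNINIT": 0}
    for line in dumpe2fs_text.splitlines():
        m = re.match(r"\s*Group \d+:.*\[([^\]]*)\]", line)
        if not m:
            continue  # not a group header, or a header without flags
        flags = {f.strip() for f in m.group(1).split(",")}
        for name in counts:
            if name in flags:
                counts[name] += 1
    return counts

# Hypothetical sample, not taken from the filesystem discussed here.
sample = """\
Group 0: (Blocks 0-32767) [ITABLE_ZEROED]
  Primary superblock at 0, Group descriptors at 1-1
Group 1: (Blocks 32768-65535) [INODE_UNINIT, ITABLE_ZEROED]
Group 2: (Blocks 65536-98303) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
"""

print(count_uninit_groups(sample))
```

Running the same function over the dumpe2fs output saved before and
after the patch gives the two columns to compare.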

So the interesting thing is that the patch only "broke open" an
additional 14 block groups (out of the 333 block groups in use when the
filesystem was created with the unpatched kernel).  However, this
allowed the pass 2 directory time to go *down* by over a factor of
three (from 21.2 seconds with the unpatched ext4 code to 6.5 seconds
with the patch).
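
To make the "factor of three" concrete, here is a small sketch that
divides the before-patch wall-clock time by the after-patch time for
each pass, using the "Real" columns from the timing table.  The
variable and function names are only for illustration.

```python
# Wall-clock ("Real") seconds from the e2fsck timing table.
before = {"Pass 1": 8.52, "Pass 2": 21.16, "Pass 3": 0.01,
          "Pass 4": 0.16, "Pass 5": 2.52, "Total": 32.40}
after = {"Pass 1": 8.84, "Pass 2": 6.54, "Pass 3": 0.01,
         "Pass 4": 0.17, "Pass 5": 2.31, "Total": 17.99}

def speedups(before, after):
    """Return before/after ratios; values above 1.0 mean the patch was faster."""
    return {name: before[name] / after[name] for name in before}

ratios = speedups(before, after)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Pass 2 comes out a little over 3.2x and the total run about 1.8x, while
the other passes stay close to 1.0x, so nearly all of the gain is in
the directory pass.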

I think what the patch did was to diminish allocation pressure on the
first block group in the flex_bg, so we weren't mixing directory and
regular file contents.  This eliminated seeks during pass 2 of e2fsck,
which was actually a Very Good Thing.

> > A simple change to verify this would be something like the following,
> > but it hasn't actually been tested.
> 
> Tell you what:  I'll try this out and see if it helps out my test case.

Let me know what this does for your test case.  Hopefully the patch
also makes things better, since this patch is looking very interesting
right now.

Andreas, can I get a Signed-off-by from you for this patch? 

Thanks,

						- Ted