linux-ext4 - Re: [PATCH 3/6] mke2fs: set block_validity as a default mount option


Message-ID: <20140825155218.GA22645@birch.djwong.org>
Date:	Mon, 25 Aug 2014 08:52:18 -0700
From:	"Darrick J. Wong" <darrick.wong@...cle.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: [PATCH 3/6] mke2fs: set block_validity as a default mount option

On Sun, Aug 24, 2014 at 06:47:21PM -0400, Theodore Ts'o wrote:
> On Fri, Aug 08, 2014 at 09:26:30PM -0700, Darrick J. Wong wrote:
> > The block_validity mount option spot-checks block allocations against
> > a bitmap of known group metadata blocks.  This helps us to prevent
> > self-inflicted catastrophic failures such as trying to "share"
> > critical metadata (think bitmaps) with file data, which usually
> > results in filesystem destruction.
> > 
> > In order to test the overhead of the mount option, I re-used the speed
> > tests in the metadata checksum testing script.  In short, the program
> > creates what looks like 15 copies of a kernel source tree, except that
> > it uses fallocate to strip out the overhead of writing the file data
> > so that we can focus on metadata overhead.  On a 64G RAM disk, the
> > overhead was generally about 0.9% and at most 1.6%.  On a 160G USB
> > disk, the overhead was about 0.8% and peaked at 1.2%.
> 
> I was doing a spot check of the additional memory impact of the
> block_validity mount option, and for a 20T file system, assuming
> the basic flex_bg size of 16 block groups, it's a bit over 400k of
> kernel memory.  That's not a *huge* amount of memory, but it could
> potentially be noticeable on a bookshelf NAS server.
> 
> However, I could imagine that for a system with say, two dozen 10T
> drives (which aren't that far off in the future) in a tray, that's
> around 4 megabytes of memory, which starts being non-trivial.
> 
> That being said, I suspect for most users, it's not that big of a deal
> --- so maybe this is something we should just simply enable by default
> in the kernel, and let those folks who want to disable it specify a
> noblock_validity mount option.

Should there be a noblock_validity default mount option?

I suppose I can simply send in a one-liner making b_v the kernel default and
see if anyone screams....
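
For reference, the memory figures quoted above can be roughly reproduced
with a back-of-the-envelope calculation. This is only a sketch under
assumptions that are not stated in the thread: 4 KiB blocks, 32768 blocks
per group (128 MiB groups), one merged system zone per flex_bg, and about
40 bytes of kernel memory per tracked zone. The function name and its
parameters are hypothetical, chosen for illustration.

```python
# Rough estimate of the memory used to track system zones for block_validity.
# Assumptions (not from the thread): 4 KiB blocks, 32768 blocks per group,
# one merged zone per flex_bg, and ~40 bytes per zone entry.

TIB = 1 << 40

def system_zone_bytes(fs_bytes, flex_bg_size=16, block_size=4096,
                      blocks_per_group=32768, zone_entry_bytes=40):
    """Return the estimated bytes of zone-tracking memory for one filesystem."""
    group_bytes = block_size * blocks_per_group      # 128 MiB with the defaults
    groups = -(-fs_bytes // group_bytes)             # ceiling division
    flex_groups = -(-groups // flex_bg_size)         # one zone per flex_bg (assumed)
    return flex_groups * zone_entry_bytes

# One 20T filesystem.
print(system_zone_bytes(20 * TIB))        # 409600 bytes, i.e. 400 KiB

# Two dozen 10T filesystems, one per drive.
print(24 * system_zone_bytes(10 * TIB))   # 4915200 bytes, i.e. about 4.7 MiB
```

Under these assumptions the single 20T case lands at 400 KiB and the
24-drive case in the 4 to 5 megabyte range, which is in the same ballpark
as the numbers quoted above. The real cost depends on how many zones the
kernel actually ends up tracking per flex_bg.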

> The other thing to consider is that for big raid arrays, maybe we
> should use a larger flex_bg size.  The main reason for keeping the
> size small is to minimize the seek time between the inode table and a
> block in the flex_bg.  But for raid devices, we could probably afford
> to increase flex_bg size, which would decrease the number of system
> zones that the block validity code would need to track.

One could make the default flexbg size = 16 * stride_width / stripe_width as a
start.
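
To illustrate the point about larger flex_bg sizes, the sketch below counts
how many system zones would need tracking on a 20T filesystem for a few
flex_bg sizes. It assumes 4 KiB blocks, 32768 blocks per group, and one
merged zone per flex_bg; none of these are stated in the thread, and
zone_count is a hypothetical helper. It does not try to model the
stride/stripe formula suggested above.

```python
# Count system zones for different flex_bg sizes.
# Assumptions (not from the thread): 4 KiB blocks, 32768 blocks per group,
# and one merged system zone per flex_bg.

def zone_count(fs_bytes, flex_bg_size, block_size=4096, blocks_per_group=32768):
    """Return the assumed number of system zones for a filesystem."""
    groups = -(-fs_bytes // (block_size * blocks_per_group))  # ceiling division
    return -(-groups // flex_bg_size)

FS_BYTES = 20 * (1 << 40)   # 20T filesystem

for size in (16, 64, 256):
    print(size, zone_count(FS_BYTES, size))
# 16 10240
# 64 2560
# 256 640
```

Under this model the zone count falls in direct proportion to the flex_bg
size, so quadrupling the size cuts the tracked zones to a quarter.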

--D
> 
>       	       	     	      	   	      - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html