linux-ext4 - Re: [PATCH] e2fsprogs: Limit number of reserved gdt blocks on small fs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.1503021832080.30128@localhost.localdomain>
Date:	Mon, 2 Mar 2015 18:43:03 +0100 (CET)
From:	Lukáš Czerner <lczerner@...hat.com>
To:	Eric Sandeen <sandeen@...hat.com>
cc:	linux-ext4@...r.kernel.org
Subject: Re: [PATCH] e2fsprogs: Limit number of reserved gdt blocks on small
 fs

On Mon, 2 Mar 2015, Eric Sandeen wrote:

> Date: Mon, 02 Mar 2015 11:27:11 -0600
> From: Eric Sandeen <sandeen@...hat.com>
> To: Lukáš Czerner <lczerner@...hat.com>
> Cc: linux-ext4@...r.kernel.org
> Subject: Re: [PATCH] e2fsprogs: Limit number of reserved gdt blocks on small
>     fs
> 
> On 3/2/15 11:08 AM, Lukáš Czerner wrote:
> > On Mon, 2 Mar 2015, Eric Sandeen wrote:
> 
> ...
> 
> >> Some extra "making sure" in the above sentence.  ;)
> > 
> > Right, I'll fix that.
> > 
> >>
> >>>                    reserved_gdt_blocks is small enough. If the user
> >>> +	 * specifies non-default values he can still run into this issue
> >>> +	 * but we do not want to make that mistake by default.
> >>> +	 *
> >>> +	 * Magic numbers:
> >>> +	 * - At block count smaller than 32768, journal will be 1024 block
> >>> +	 *   blocks long which would not be enough by default.
> >>> +	 * - 64 reserved gtd blocks is maximum number we can fit into said
> >>
> >> gtd?  (gdt?  gdb?)
> > 
> > Oh yes...
> > 
> > I'll try to use it consistently.
> 
> thanks :)
> 
> >>
> >>> +	 *   journal by default (flex_bg_count * 4 + rsv_gdb)
> >>
> >> This is a little confusing; "flex_bg_count" isn't a thing, and it seems to
> >> say that the reserved blocks depend on the number of reserved blocks?
> > 
> > Does it ? It says that number of journal credits depend on the
> > number of reserved gdt blocks, or maybe I am missing something.
> > 
> > flex_bg_count is a number of groups in flex group, I'll try to come
> > up with a better name.
> 
> It just looks like a variable name; it's one that doesn't exist, so it's
> not clear what you're talking about.
> 
> >>
> >> I'm curious, why does this matter only for 1k blocks, and not, say 2k blocks?
> > 
> > That's because of the way we compute the number of reserved GDT
> > blocks. The number of GDT blocks depends on the number of block
> > groups, which depends on the number of blocks in the file system.
> > And with 1k file system we have much more blocks. In fact we have:
> > 
> > - twice as many blocks as with 2k file system
> > - four times the block groups
> > 
> > And we can also fit only half of the number of group descriptors
> > into a single block which means that we need twice as many blocks
> > per block group.
> > 
> > Which means that with 1k file system we need 8 times as many gdt
> > blocks as with 2k fs and 2048 times more gdt blocks as with 4k with
> > the same setup.
> > 
> >>
> >>> +	 */
> >>> +	if ((fs->blocksize == 1024) && (ext2fs_blocks_count(sb) < 32768) &&
> >>> +	    (rsv_gdb > 64))
> >>> +		rsv_gdb = 64;
> >>> +
> >>>  	if (rsv_gdb > EXT2_ADDR_PER_BLOCK(sb))
> >>>  		rsv_gdb = EXT2_ADDR_PER_BLOCK(sb);
> >>>  #ifdef RES_GDT_DEBUG
> >>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
> >>> index aeb852f..e9edd3b 100644
> >>> --- a/misc/mke2fs.c
> >>> +++ b/misc/mke2fs.c
> >>> @@ -3076,6 +3076,34 @@ int main (int argc, char *argv[])
> >>>  		}
> >>>  		if (!quiet)
> >>>  			printf("%s", _("done\n"));
> >>> +
> >>> +		/*
> >>> +		 * In online resize we require a huge number of journal
> >>> +		 * credits mainly because of the reserved_gdt_blocks. The
> >>> +		 * exact number of journal credits needed is:
> >>> +		 * flex_gd->count * 4 + reserved_gdb
> >>> +		 *
> >>> +		 * Warn user if we will not have enough credits to resize
> >>> +		 * the file system online.
> >>> +		 */
> >>> +		if (fs_param.s_feature_compat & EXT2_FEATURE_COMPAT_RESIZE_INODE)
> >>> +		{
> >>> +			unsigned int credits;
> >>> +
> >>> +			/*
> >>> +			* This is the amount we're allowed to use for a
> >>> +			* single handle in a transaction
> >>> +			*/
> >>> +			journal_blocks /= 8;
> >>
> >> This seems a little dangerous; if someone decides to use journal_blocks
> >> later in the function, it might be unexpectedly smaller.
> >>
> >>> +			credits = (1 << fs_param.s_log_groups_per_flex) * 4 +
> >>> +				  fs->super->s_reserved_gdt_blocks;
> >>> +			if (credits > journal_blocks) {
> >>
> >> so maybe just scale this by 8, i.e. "if (credits > journal_blocks/8)"
> > 
> > Fair enough, I'll change that.
> > 
> >>
> >> And actually, I find myself wondering if this same calculation can be used
> >> in calc_reserved_gdt_blocks, rather than using the magic numbers?
> > 
> > Unfortunately it can not. At the time we're calculating number of gdt
> > blocks we have no idea how big the journal will be.
> > 
> > And at the time we already know how big the journal will be we
> > already have blocks allocated for the resize_inode so we can not
> > just simply change s_reserved_gdt_blocks here.
> 
> Ugh.  That sounds like a mess.
> 
> But it feels like we need centralized functions or macros that express the
> relationships between these values, rather than sprinkling "* 4" and "if 1024"
> around.  It seems to have the potential to get out of sync, but I've only
> given it cursory thought...

Yes, but that would require huge amount of rewrite I think. Mostly
we're trying to think of the dependencies between various file
system features and/or parameters. I think it's mostly in main() and
PRS() functions however this "dependency" is entirely unexpected.

I tried to think about a better way where to put it, but this is what
I came with. However maybe we can determine the journal size by
ext2fs_default_journal_size() and use the default journal size
instead ?

-Lukas

> 
> -Eric
>