[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54F49D6F.7080509@redhat.com>
Date: Mon, 02 Mar 2015 11:27:11 -0600
From: Eric Sandeen <sandeen@...hat.com>
To: Lukáš Czerner <lczerner@...hat.com>
CC: linux-ext4@...r.kernel.org
Subject: Re: [PATCH] e2fsprogs: Limit number of reserved gdt blocks on small
fs
On 3/2/15 11:08 AM, Lukáš Czerner wrote:
> On Mon, 2 Mar 2015, Eric Sandeen wrote:
...
>> Some extra "making sure" in the above sentence. ;)
>
> Right, I'll fix that.
>
>>
>>> reserved_gdt_blocks is small enough. If the user
>>> + * specifies non-default values he can still run into this issue
>>> + * but we do not want to make that mistake by default.
>>> + *
>>> + * Magic numbers:
>>> + * - At block count smaller than 32768, journal will be 1024 block
>>> + * blocks long which would not be enough by default.
>>> + * - 64 reserved gtd blocks is maximum number we can fit into said
>>
>> gtd? (gdt? gdb?)
>
> Oh yes...
>
> I'll try to use it consistently.
thanks :)
>>
>>> + * journal by default (flex_bg_count * 4 + rsv_gdb)
>>
>> This is a little confusing; "flex_bg_count" isn't a thing, and it seems to
>> say that the reserved blocks depend on the number of reserved blocks?
>
> Does it ? It says that number of journal credits depend on the
> number of reserved gdt blocks, or maybe I am missing something.
>
> flex_bg_count is a number of groups in flex group, I'll try to come
> up with a better name.
It just looks like a variable name; it's one that doesn't exist, so it's
not clear what you're talking about.
>>
>> I'm curious, why does this matter only for 1k blocks, and not, say 2k blocks?
>
> That's because of the way we compute the number of reserved GDT
> blocks. The number of GDT blocks depends on the number of block
> groups, which depends on the number of blocks in the file system.
> And with 1k file system we have much more blocks. In fact we have:
>
> - twice as many blocks as with 2k file system
> - four times the block groups
>
> And we can also fit only half of the number of group descriptors
> into a single block which means that we need twice as many blocks
> per block group.
>
> Which means that with 1k file system we need 8 times as many gdt
> blocks as with 2k fs and 2048 times more gdt blocks as with 4k with
> the same setup.
>
>>
>>> + */
>>> + if ((fs->blocksize == 1024) && (ext2fs_blocks_count(sb) < 32768) &&
>>> + (rsv_gdb > 64))
>>> + rsv_gdb = 64;
>>> +
>>> if (rsv_gdb > EXT2_ADDR_PER_BLOCK(sb))
>>> rsv_gdb = EXT2_ADDR_PER_BLOCK(sb);
>>> #ifdef RES_GDT_DEBUG
>>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
>>> index aeb852f..e9edd3b 100644
>>> --- a/misc/mke2fs.c
>>> +++ b/misc/mke2fs.c
>>> @@ -3076,6 +3076,34 @@ int main (int argc, char *argv[])
>>> }
>>> if (!quiet)
>>> printf("%s", _("done\n"));
>>> +
>>> + /*
>>> + * In online resize we require a huge number of journal
>>> + * credits mainly because of the reserved_gdt_blocks. The
>>> + * exact number of journal credits needed is:
>>> + * flex_gd->count * 4 + reserved_gdb
>>> + *
>>> + * Warn user if we will not have enough credits to resize
>>> + * the file system online.
>>> + */
>>> + if (fs_param.s_feature_compat & EXT2_FEATURE_COMPAT_RESIZE_INODE)
>>> + {
>>> + unsigned int credits;
>>> +
>>> + /*
>>> + * This is the amount we're allowed to use for a
>>> + * single handle in a transaction
>>> + */
>>> + journal_blocks /= 8;
>>
>> This seems a little dangerous; if someone decides to use journal_blocks
>> later in the function, it might be unexpectedly smaller.
>>
>>> + credits = (1 << fs_param.s_log_groups_per_flex) * 4 +
>>> + fs->super->s_reserved_gdt_blocks;
>>> + if (credits > journal_blocks) {
>>
>> so maybe just scale this by 8, i.e. "if (credits > journal_blocks/8)"
>
> Fair enough, I'll change that.
>
>>
>> And actually, I find myself wondering if this same calculation can be used
>> in calc_reserved_gdt_blocks, rather than using the magic numbers?
>
> Unfortunately it can not. At the time we're calculating number of gdt
> blocks we have no idea how big the journal will be.
>
> And at the time we already know how big the journal will be we
> already have blocks allocated for the resize_inode so we can not
> just simply change s_reserved_gdt_blocks here.
Ugh. That sounds like a mess.
But it feels like we need centralized functions or macros that express the
relationships between these values, rather than sprinkling "* 4" and "if 1024"
around. It seems to have the potential to get out of sync, but I've only
given it cursory thought...
-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists