lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 15 Dec 2020 15:13:06 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     brookxu <brookxu.cn@...il.com>
Cc:     adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org
Subject: Re: [PATCH RESEND 4/8] ext4: add the gdt block of meta_bg to
 system_zone

You did your test on a 80T file system, but that's not where someone
would be using meta_bg.  Meta_bg ges used for much larger file systems
than that!  With meta_bg, we have 3 block group descriptors every 64
block groups.  Each block group describes 128M of memory.  So for that
means we are going to have 3 entries in the system zone tree for every_
8GB of file system space, 383,216 entries for every PB.  Given that
each entry is 40 bytes, that means that the block_validity entries
will consume 15 megabytes per PB.

Now, one third of these entries overlap with the flex_bg entries
(meta_bg groups are in the first, second, and last block group of each
meta_bg, where are 64 block groups in 4k file systems), and of course,
the default flex_bg size of 16 block groups means that there are
524,288 entries per PB.  So if we include all backup sb and block
groups, in a 1 PB file system, there will be roughly 786,432 entries
in a 1 PB file system.  (I'm ignoring the entries for the backup
superblocks, but that's only about 20 or so extra entries.)

So for a flex_bg 1PB file system, the amount of memory for a
block_validity data structure is roughly 20M, and including all backup
descriptors for meta_bg on a flex_bg + meta_bg setup is roughly 30M.

I agree with you that for a non-meta_bg file system, including all of
the backup superblock and block group descriptors is not going to be
large.  But while protecting the meta_bg group descriptors is
worthwhile, protecting the backup meta_bg's is not free, and will
increase the size of the tree by 33%.

I'm also wondering whether or not Lustre (where they do have some file
systems that are in the PB range) have run into overhead issues with
block_validity.

What do folks think?

						- Ted

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ