linux-ext4 - Re: Maildir quickly hitting max htree


Message-ID: <2111161753010.26337@stax.localdomain>
Date:   Tue, 16 Nov 2021 19:31:10 +0000 (GMT)
From:   Mark Hills <mark@...x.org>
To:     Theodore Ts'o <tytso@....edu>
cc:     Andreas Dilger <adilger@...ger.ca>, linux-ext4@...r.kernel.org
Subject: Re: Maildir quickly hitting max htree

On Sun, 14 Nov 2021, Theodore Ts'o wrote:

> On Sat, Nov 13, 2021 at 12:05:07PM +0000, Mark Hills wrote:
> > 
> > Interesting! The 1Kb block size was not explicitly chosen. There was no 
> > plan other than using the defaults.
> > 
> > However I did forget that this is a VM installed from a base image. The 
> > root cause is likely to be that the /home partition has been enlarged from 
> > a small size to 32Gb.
> 
> How small was the base image?

/home was created at 256MB and has never been shrunk.

> As documented in the man page for mke2fs.conf, for file systems that are 
> smaller than 3mb, mke2fs uses the parameters in /etc/mke2fs.conf for type 
> "floppy" (back when 3.5 inch floppies were either 1.44MB or 2.88MB).  
> So it must have been a really tiny base image to begin with.

Small, but not microscopic :)

I see a definition in mke2fs.conf for "small" which uses 1024 blocksize, 
and I assume it originated there and not "floppy".

> > These days I think VMs make it more common to enlarge a filesystem from a 
> > small size. We could have picked this up earlier with a warning from 
> > resize2fs; eg. if the block size will no longer match the one that would 
> > be chosen by default. That would pick it up before anyone puts 1Kb block 
> > size into production.
> 
> It would be a bit tricky for resize2fs to do that, since it doesn't
> know what might have been in the mke2fs.conf file at the time when the
> file system was created.  Distributions or individual system
> administrators are free to modify that config file.

No need to time travel back -- it's complicated, and actually less 
relevant?

I haven't looked at resize2fs code, so this comes just from a user's 
point-of-view but... if it is already reading mke2fs.conf, it could make 
comparisons using an equivalent new filesystem as benchmark.

In the spirit of eg. "your resized filesystem will have a block size of 
1024, but a new filesystem of this size would use 4096"

Then you can compare any absolute metric of the filesystem that way.
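
To make the idea concrete, here is a rough sketch in Python of the kind of
check I mean. It is not based on the resize2fs code. The size thresholds
follow my reading of the mke2fs.conf man page ("floppy" below 3MB, "small"
below 512MB), the block sizes in ASSUMED_BLOCKSIZE are assumptions standing
in for whatever the local mke2fs.conf says, and the function names are made
up for illustration. The "big" and "huge" types are left out for brevity.

```python
MIB = 1024 * 1024

# Assumed block sizes per size type; a real check would read these
# from the local mke2fs.conf rather than hard-code them.
ASSUMED_BLOCKSIZE = {"floppy": 1024, "small": 1024, "default": 4096}


def size_type(size_bytes):
    """Pick the mke2fs size type a new filesystem of this size would get."""
    if size_bytes < 3 * MIB:
        return "floppy"
    if size_bytes < 512 * MIB:
        return "small"
    return "default"


def resize_warning(current_blocksize, new_size_bytes):
    """Return a warning string, or None if the block size already matches."""
    expected = ASSUMED_BLOCKSIZE[size_type(new_size_bytes)]
    if current_blocksize == expected:
        return None
    return (
        f"your resized filesystem will have a block size of "
        f"{current_blocksize}, but a new filesystem of this size "
        f"would use {expected}"
    )


if __name__ == "__main__":
    # A filesystem created at 256MB (1024-byte blocks), grown to 32GB.
    print(resize_warning(1024, 32 * 1024 * MIB))
```

The same shape of comparison would work for any other absolute property
that mke2fs derives from the target size.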

The advantage being...

> It is a good idea for resize2fs to give a warning, though.  What I'm 
> thinking might make sense is, if resize2fs is expanding the file 
> system by more than, say, a factor of 10x (e.g., expanding a file system 
> from 10mb to 100mb, or 3mb to 20gb)

... that the benchmark gives you a comparison that won't drift. eg. if you 
resize by +90% several times.

And reflects any desires that may be in the configuration.
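
A small Python illustration of the drift, as a sketch only: it assumes a
per-resize check that fires when a single resize grows the filesystem by
10x or more, and the function names are hypothetical. Starting at 256MB and
growing by +90% each time, the filesystem passes 32GB without any single
step reaching the 10x threshold.

```python
def ratio_check_fires(old_size, new_size, factor=10):
    """Hypothetical per-resize check: warn only on a single >= 10x growth."""
    return new_size >= old_size * factor


def grow_repeatedly(start_mib, steps):
    """Grow by +90% per step; return final size and per-step check results."""
    size = start_mib
    fired = []
    for _ in range(steps):
        new_size = size * 19 // 10  # +90%, kept in whole MiB
        fired.append(ratio_check_fires(size, new_size))
        size = new_size
    return size, fired


if __name__ == "__main__":
    final_mib, fired = grow_repeatedly(256, 8)
    print(final_mib, any(fired))
```

A check against what a new filesystem of the final size would use does not
depend on how many steps were taken to get there.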

> to give a warning that inflating file systems is an anti-pattern that 
> will not necessarily result in the best file system performance.

I imagine it's not a panacea, but it would be good to be more concrete on 
what the gotchas are; "bad performance" is vague, and since the tool 
exists it must be possible to use it properly.

I'll need to consult the docs, but so far have been made aware of:

* block size
  (which has a knock-on effect on file limits per directory)

* journal size
  (not in configuration file -- can this be adjusted?)

* files get fragmented when shrinking a filesystem
  (but this is similar to any full file system?)
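
For the first item, the block size of an existing filesystem can be checked
before it goes into production. A minimal Python sketch, assuming the
"Block size:" line printed by "tune2fs -l"; the helper names are my own and
the device path in the usage line is only an example.

```python
import re
import subprocess


def parse_block_size(tune2fs_output):
    """Extract the block size in bytes from `tune2fs -l` output, or None."""
    match = re.search(r"^Block size:\s+(\d+)\s*$", tune2fs_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def block_size_of(device):
    """Run `tune2fs -l` on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["tune2fs", "-l", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_block_size(result.stdout)
```

Usage would be along the lines of block_size_of("/dev/vg0/home"), with the
device path replaced by the real one.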

I'm generally aware of all of these and their implications; they are just 
easy to miss when you're busy and focused on other aspects (it completely 
escaped me that the filesystem had been enlarged when I began this 
thread!)

That's why the patch in the other thread is not a bad idea; just reminding 
that block size is relevant.

For info, our use case here is a base image used to deploy persistent 
VMs which use very different disk sizes. The base image is built using 
packer+QEMU, managed as code. It is then written using "dd", and the LVM 
partitions are expanded without needing to go single-user or take the 
system offline. This method is appealing because it allows us to 
pre-populate /home with some small amount of data; SSH keys etc.

For the case that started this thread, we just wiped the filesystem and 
made a new one at the target size of 32GB.

> Even if the blocksize isn't 1k, when a file system is shrunk
[...more on shrinking]

Many thanks,

-- 
Mark