Date:	Tue, 28 Jan 2014 14:02:02 -0700
From:	Andreas Dilger <adilger@...ger.ca>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	Theodore Ts'o <tytso@....edu>, Masato Minda <minmin@...s.co.jp>,
	Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: How many files to create in one directory?

On Jan 27, 2014, at 12:48 PM, Eric Sandeen <sandeen@...hat.com> wrote:
> On 1/27/14, 1:39 PM, Theodore Ts'o wrote:
>>> It will depend on the length of the filenames.  But by my calculations,
>>> for average 28-char filenames, it's closer to 30 million.

Note that there is also a 2GB directory size limit imposed by not using
i_size_high for directories.  That works out to about:

  (2^31 bytes / 4096 bytes/block) *
  ((4096 bytes/block / (28 + 4 + 4 bytes/entry)) * 0.75 full) ~= 44M entries
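
As a quick sanity check, here is the same arithmetic in C (the block
size, the 36-byte entry size, and the 0.75 average-fullness factor are
just the assumptions from the estimate above):

  #include <stdio.h>

  int main(void)
  {
          unsigned long long dir_bytes = 1ULL << 31;  /* 2GB i_size limit */
          unsigned long long blocks = dir_bytes / 4096;
          unsigned int entries_per_block = 4096 / (28 + 4 + 4);
          double total = blocks * entries_per_block * 0.75;

          printf("~%.0fM entries\n", total / 1e6);  /* prints ~44M */
          return 0;
  }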

We have a patch that allows using i_size_high for directories and adds
3rd-level htree support for small-block filesystems or very large
directories.  However, we haven't written e2fsck support for it, and it
isn't currently enabled.

If someone is interested in taking a look at this:
http://git.whamcloud.com/?p=fs/lustre-release.git;a=blob;f=ldiskfs/kernel_patches/patches/sles11sp2/ext4-pdirop.patch;h=4d2acffadaa31a1bdd9f3a592cda71dfcdd585a4;hb=HEAD

The "htree lock" part of the patch is for allowing parallel
create/lookup/unlink access to the large directory, but last time
I asked Al Viro about this he didn't seem interested in exporting
that functionality to the VFS.

>> Note that there will be some very significant performance problems
>> well before a directory gets that big.  For example, simply doing a
>> readdir + stat on all of the files in that directory (or a readdir +
>> unlink, etc.) will very likely result in unacceptably poor
>> performance.
> 
> Yep, that's the max possible, not the max useable.  ;)
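
For reference, the readdir + stat pattern Ted describes is roughly the
sketch below ("bigdir" is a placeholder path).  The names come back in
hash order rather than inode order, so on a huge directory the per-name
stat() calls are effectively random access:

  #include <dirent.h>
  #include <stdio.h>
  #include <sys/stat.h>

  int main(void)
  {
          DIR *dir = opendir("bigdir");  /* placeholder path */
          struct dirent *de;
          struct stat st;

          if (!dir)
                  return 1;
          /* One stat() per name; with millions of entries these hit
           * inodes scattered all over the disk. */
          while ((de = readdir(dir)) != NULL)
                  if (fstatat(dirfd(dir), de->d_name, &st, 0) == 0)
                          printf("%s %lld\n", de->d_name,
                                 (long long)st.st_size);
          closedir(dir);
          return 0;
  }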

In newer kernels it is also possible to put an upper limit on the size
of a directory via the /sys/fs/ext4/{dev}/max_dir_size_kb tunable or the
matching mount option.  This prevents users from creating directories so
big that they can't be handled by normal tools.
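
For example, capping directories at 512MB for a filesystem on a
hypothetical device sda1 could be done like this (requires root; the
max_dir_size_kb mount option sets the same limit):

  #include <stdio.h>

  int main(void)
  {
          /* 524288 KB = 512MB; the sda1 path is just an example */
          FILE *f = fopen("/sys/fs/ext4/sda1/max_dir_size_kb", "w");

          if (!f)
                  return 1;
          fprintf(f, "524288\n");
          return fclose(f) ? 1 : 0;
  }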

> (Although, I'm not sure in practice what max useable looks like, TBH).

We regularly test with 10M files per directory.  Obviously, workloads
that do this do not use "ls -l" or equivalent, but just do
lookup-by-name from within applications.  In our testing it remains
usable up to about 15M entries, at which point the level-2 leaf blocks
can start filling up (due to uneven usage of the leaf blocks).
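
A minimal sketch of that lookup-by-name pattern, with made-up directory
and object names; known names are opened directly and the directory is
never listed:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <unistd.h>

  int main(void)
  {
          const char *names[] = { "obj.0000001", "obj.0000002" };
          int dfd = open("bigdir", O_RDONLY | O_DIRECTORY);
          size_t i;

          if (dfd < 0)
                  return 1;
          /* Direct name lookups; no readdir of the huge directory. */
          for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
                  int fd = openat(dfd, names[i], O_RDONLY);

                  if (fd >= 0)
                          close(fd);  /* ... use the object ... */
          }
          close(dfd);
          return 0;
  }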

Cheers, Andreas

>> So if you can find some way to avoid letting the directory get that
>> big (i.e., using a real database instead of trying to use the file
>> system as a database, etc.), I'd strongly suggest that you consider
>> those alternatives.
>> 
>> Regards,
>> 
>> 					- Ted
>> 

