Date:	Tue, 15 Mar 2011 15:08:53 -0400
From:	Phillip Susi <psusi@....rr.com>
To:	Ted Ts'o <tytso@....edu>
CC:	Eric Sandeen <sandeen@...hat.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: Large directories and poor order correlation

On 3/15/2011 1:08 PM, Ted Ts'o wrote:
> No, because the directory blocks are the leaf nodes, and in the case
> of a node split, we need to copy half of the directory entries in one
> block, and move it to a newly allocated block.  If readdir() was
> traversing the linear directory entries, and had already traversed
> that directory block that needs to split, then you'll return those
> directory entries that got copied into a new leaf (i.e., new directory
> block) a second time.
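
To make the scenario in the quoted paragraph concrete, here is a toy model of it. This is only a sketch under loud assumptions: blocks are plain Python lists, the "split" moves the upper half of a block by position rather than by hash as a real htree split would, and the function name and parameters are hypothetical, not anything from ext3/ext4.

```python
def linear_readdir_with_split(blocks, split_after_block):
    """Toy model: walk directory blocks in linear order.

    Right after the block at index `split_after_block` has been read,
    that block is split: its upper half is moved to a newly appended
    block. Returns every name the walk reports, in order.
    """
    seen = []
    i = 0
    while i < len(blocks):
        seen.extend(blocks[i])
        if i == split_after_block:
            half = len(blocks[i]) // 2
            moved = blocks[i][half:]
            blocks[i] = blocks[i][:half]
            blocks.append(moved)       # new leaf lands at the end
            split_after_block = None   # split only once
        i += 1
    return seen


names = linear_readdir_with_split([["a", "b", "c", "d"], ["e", "f"]], 0)
duplicates = sorted({n for n in names if names.count(n) > 1})
```

In this model the walk reports "c" and "d" twice, because they were copied into a block the walk had not reached yet.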

When you split the htree node, aren't you just moving around the
"deleted entries"?  So the normal names remain in the same order so
readdir() doesn't have a problem when it is ignoring the htree entries
and just walking the normal names?

Also, how do you deal with this when you do end up rebalancing the
htree during a readdir()?  I would think that keeping that straight
would be much more difficult than handling the problem with linear
directory entries.

Why was the htree hidden inside the normal directory structure anyway?

> Unless some files get deleted in between.  Now depending on the
> "holes" in the directory blocks, where the new directory entries are
> added, even in the non-htree case, could either be wherever an empty
> directory entry could be found, or in the worst case, we might need to
> allocate a new block and that new directory entry gets added to the
> end of the block.

Right, but on an otherwise idle system, when you create all the files at
once via rsync or by untarring an archive, this shouldn't happen and they
should (generally) be in ascending order, shouldn't they?

> I suggest that you try some experiments, using both dir_index and
> non-dir_index file systems, and then looking at the directory using
> the debugfs "ls" and "htree_dump" commands.  Either believe me, or
> learn how things really work.  :-)

Now THAT sounds interesting.  Is this documented somewhere?

Also, why can't chattr set/clear the 'I' flag?  Is it just cumbersome
to do at runtime?  So setting and clearing the flag with debugfs
followed by a fsck should do the trick?  And when is it automatically
enabled?

> I suppose we could allocate up to some tunable amount worth of
> directory space, say 64k or 128k, and do the sorting inside the
> kernel.  We then have to live with the fact that each badly behaved
> program which calls opendir(), and then a single readdir(), and then
> stops, will consume 128k of non-swappable kernel memory until the
> process gets killed.  A process which does this thousands of times
> could potentially carry out a resource exhaustion attack on the
> system.  Which we could then try to patch over, by say creating a new
> resource limit of the number of directories a process can keep open at
> a time, but then the attacker could just fork some additional child
> processes....

I think you are right in that if sorting is to be done at
opendir()/readdir() time, then it should be done in libc, not the
kernel, but it would be even better if the fs made some effort to store
the entries in a good order so no sorting is needed at all.
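
For illustration, the userspace version of that sort might look roughly like the following. It is a minimal sketch, not what libc does: it assumes that visiting entries in ascending inode order is the "good order" we want, and the function name is made up for this example.

```python
import os


def entries_sorted_by_inode(path):
    """Return the names in `path`, ordered by ascending inode number.

    The whole directory is read into process memory first, so the cost
    of the sort lands on the caller rather than on the kernel.
    """
    with os.scandir(path) as it:
        pairs = [(entry.inode(), entry.name) for entry in it]
    pairs.sort()
    return [name for _, name in pairs]
```

A caller would then stat() or open() the names in the returned order instead of in raw readdir() order.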

