lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070815182252.GA14104@ucar.edu>
Date:	Wed, 15 Aug 2007 12:22:52 -0600
From:	Craig Ruff <cruff@...r.edu>
To:	Marc Perkel <mperkel@...oo.com>
Cc:	Kyle Moffett <mrmacman_g4@....com>,
	Michael Tharp <gxti@...tiallystapled.com>,
	alan <alan@...eserver.org>,
	LKML Kernel <linux-kernel@...r.kernel.org>,
	Lennart Sorensen <lsorense@...lub.uwaterloo.ca>
Subject: Re: Thinking outside the box on file systems

On Wed, Aug 15, 2007 at 10:30:19AM -0700, Marc Perkel wrote:
> --- Kyle Moffett <mrmacman_g4@....com> wrote:
> > Except they do, and without directories the
> > performance of your average filesystem is going to suck.
> 
> Actually you would get a speed improvement. You hash
> the full name and get the file number. You don't have
> to break up the name into sections except for
> evaluating name permissions.
> 
> The important concept here is that files and name
> aren't stored by levels of directories. The name
> points to the file number. Directory levels are
> emulated based on name separation characters or any
> other algorithm that you want to use.
> 
> One could create a file system and permission system
> that gets rid of the concept of directories entirely
> if one chooses to.

I would like to add support for Kyle's assertion.

The model described by Marc is exactly the method used by the current
version of the NCAR Mass Storage Service (MSS), which is data archive
of 4+ petabytes contained in 40+ million files.  To the user's point
of view, it looks somewhat like a POSIX file system with both some
extensions and deficiencies.  The MSS was designed in the mid-1980s,
in an era where the costs of the supercomputers (Cray-1s at that time)
were paramount.  This lead to some MSS design decisions to minimize the
need for users to rerun jobs on the expensive supercomputer just because
they messed up their MSS file creation statements.

Files names are a maximum of 128 bytes, with a dynamically managed
directory structure indicated by '/' characters in the name.  The file
name is hashed, and the hash table provides the internal file number (the
address in the Master File Directory (MFD)).  Any parent directories
are created automatically by the system upon file creation, and are
automatically deleted if empty upon file deletion.  Directories also
have a self pointer, and both files and directories are chained together
to allow the user to list (or otherwise manipulate) the contents of
a directory.

The biggest problem with this model is that to manipulate the a directory
itself, you have to simulate the operation on all of the files contained
within it.  For example to rename a directory with 'n' descendants,
you must perform:

	n+1 hash table removals
	n+1 hash table insertions (with collision detection)
	n+1 MFD record updates
	1   directory chain removal
	1   directory chain insertion

This is, needless to say, very painful when n is large.  Since users
must use directory trees to efficiently manage their data holdings,
efficient directory manipulation is essential.  Contrast this with
the number of operations required for a directory rename if files
do not record their complete pathname:

	1 directory chain removal
	1 directory chain insertion

Fortunately we are currently working to change from using a model like
Marc describes to one Kyle describes.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ