lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070619075857.GA2944@brong.net>
Date:	Tue, 19 Jun 2007 17:58:57 +1000
From:	Bron Gondwana <brong@...tmail.fm>
To:	Kyle Moffett <mrmacman_g4@....com>
Cc:	Bryan Henderson <hbryan@...ibm.com>,
	Jack Stone <jack@...keye.stone.uk.eu.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	alan <alan@...eserver.org>, "H. Peter Anvin" <hpa@...or.com>,
	linux-fsdevel@...r.kernel.org,
	LKML Kernel <linux-kernel@...r.kernel.org>,
	Al Viro <viro@...iv.linux.org.uk>, git@...r.kernel.org
Subject: Re: Versioning file system

On Mon, Jun 18, 2007 at 11:10:42PM -0400, Kyle Moffett wrote:
> On Jun 18, 2007, at 13:56:05, Bryan Henderson wrote:
>>> The question remains is where to implement versioning: directly in 
>>> individual filesystems or in the vfs code so all filesystems can use it?
>>
>> Or not in the kernel at all.  I've been doing versioning of the types I 
>> described for years with user space code and I don't remember feeling that 
>> I compromised in order not to involve the kernel.
>
> What I think would be particularly interesting in this domain is something 
> similar in concept to GIT, except in a file-system:

I've written a couple of user-space things very much like this - one
being a purely database (blobs in database, yeah I know) system for
managing medical data, where signatures and auditability were the most
important part of the system.  Performance really wasn't a
consideration.

The other one is my current job, FastMail - we have a virtual filesystem
which uses files stored by sha1 on ordainary filesystems for data
storage and a database for metadata (filename to sha1 mappings, mtime,
mimetype, directory structure, etc).

Multiple machine distribution is handled by a daemon on each machine
which can be asked to make sure the file gets sent out to every machine
that matches the prefix and will only return success once it's written
to at least one other machine.  Database replication is a different
beast.


It can work, but there's one big pain at the file level: no mmap.

If you don't want to support mmap it can work reasonably happily, though
you may want to keep your sha1 (or other digest) state as well as the
final digest so you can cheaply calculate the digest for a small append
without walking the entire file.  You may also want to keep state
checkpoints every so often along a big file so that truncates don't cost
too much to recalculate.

Luckily in a userspace VFS that's only accessed via FTP and DAV we can
support a limited set of operations (basically create, append, read,
delete)  You don't get that luxury for a general purpose filesystem, and
that's the problem.  There will always be particular usage patterns
(especially something that mmaps or seeks and touches all over the place
like a loopback mounted filesystem or a database file) that just dodn't
work for file-level sha1s.


It does have some lovely properties though.  I'd enjoy working in an
envionment that didn't look much like POSIX but had the strong
guarantees and auditability that addressing by sha1 buys you.

Bron.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ