Message-ID: <4D17DE0D.2070504@ontolinux.com>
Date:	Mon, 27 Dec 2010 01:30:05 +0100
From:	Christian Stroetmann <stroetmann@...olinux.com>
To:	Ted Ts'o <tytso@....edu>
CC:	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-ext4@...r.kernel.org,
	Olaf van der Spek <olafvdspek@...il.com>,
	Nick Piggin <npiggin@...il.com>
Subject: Re: Atomic non-durable file write API

On 26.12.2010 23:10, Ted Ts'o wrote:
> On Sun, Dec 26, 2010 at 07:51:23PM +0100, Olaf van der Spek wrote:
>
<snip>
> As I said earlier, "file systems are not databases", and "databases
> are not file systems".  Oracle tried to foist their database as a file
> system during the dot.com boom, and everyone laughed at them; the
> performance was a nightmare.  If Oracle wasn't able to make a
> transaction engine that supports transactions and rollbacks
> performant, you really expect that you'll be able to do it?

An FS could easily take on the remaining functions of a database 
management system (DBMS) and become an FSDB, a hybrid if you wish. An 
example of such a hybrid is the ext2/3-sqlite FS, and it has only two 
small architectural problems: one concerns the structure and naming 
scheme of the API, and the other concerns how the programmer and the 
user handle FS caching, given the many different options available.

Furthermore, the performance of Oracle's solutions was, and still is, so 
low because they have a file system implemented as a database, which is 
managed by a DBMS as a file, which in turn is stored in an FS. Can you 
see now what causes the loss of performance?
Oracle also fears FSs like R4 that have database(-like) functionality, 
so it took those technical features of R4 for BTRFS, hoping that BTRFS 
could stop R4's show.
Also, some months ago Oracle started promoting its FS-in-a-DB-in-an-FS 
concept again.

So there must be something highly interesting about the idea of using 
an FS as a DBMS, not only for Oracle, but for at least the four largest 
software companies.

<snip>
>
>> Providing transaction semantics for multiple files is a far broader
>> proposal and not necessary to implement this proposal.
> But providing magic transaction semantics for a single file in the
> rename is not at all clearly useful.  You need to justify all of this
> hard effort, and performance loss.  (Well, or if you're so smart you
> can implement your own file system that does all of this work, and we
> can benchmark it against a file system that doesn't do all of this
> work....)

But then the benchmark must be done correctly, which means that the FS 
without transactions must be used together with a transaction mechanism 
provided by an additional software component. Otherwise the benchmark 
would be worthless.

>> I'm not sure, but Ted appears to be saying temp file + rename (but no
>> fsync) isn't guaranteed to work either.
> It won't work if you get really unlucky and your system takes a power
> cut right at the wrong moment during or after the rename().  It could
> be made to work, but at a performance cost.  And the question is
> whether the performance cost is worth it.  At the end of the day it's
> all about the tradeoff between performance cost, implementation
> cost, and value to the user and the application programmer.  Which is
> why you need to articulate the use case where this makes sense.

see above
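
For reference, the temp file + rename pattern discussed in the quoted 
text can be sketched roughly as follows. This is only a minimal 
illustration, assuming a POSIX system; the function name replace_file 
and its durable flag are made up for this example and are not part of 
any proposed API. With durable=False it corresponds to the "no fsync" 
variant, which, as Ted notes, is not guaranteed to survive a power cut 
around the rename().

```python
import os
import tempfile


def replace_file(path, data, durable=True):
    """Replace the contents of `path` with `data` via temp file + rename.

    Hypothetical helper for illustration only. With durable=False no
    fsync() is issued, so the outcome after a crash depends on the FS.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so that the rename
    # stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            if durable:
                # Push the new data to stable storage before the rename.
                os.fsync(tmp.fileno())
        # rename() semantics: readers see the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    if durable:
        # Also sync the directory so the rename itself is persisted.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The performance cost under discussion is the difference between the two 
variants: the durable one waits for the fsync() calls, the other one 
does not.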

> It's not dpkg, and it's not file editors.  What is it, specifically?
> And why can it tolerate data loss in the case of quota overruns and
> wireless connection hits, but not in the case of system crashes?
>
>> It just seems quite suboptimal. There's no need for infinite storage
>> (or an oracle) to avoid this.
> If you're so smart, why don't you try implementing it?  It's going to
> be hard for us to convince you why it's going to be non-trivial and
> have huge implementation *and* performance costs,

see above

>   so why don't you
> produce the patches that make this all work?
>
> 						- Ted
>

Christian Stroetmann
