Message-ID: <5c49b0ed0607311705t1eb8fc6bs9a68a43059bfa91a@mail.gmail.com>
Date:	Mon, 31 Jul 2006 17:05:51 -0700
From:	"Nate Diller" <nate.diller@...il.com>
To:	"David Lang" <dlang@...italinsight.com>
Cc:	"Matthias Andree" <matthias.andree@....de>,
	"Adrian Ulrich" <reiser4@...nkenlights.ch>,
	"Horst H. von Brand" <vonbrand@....utfsm.cl>, ipso@...ppymail.ca,
	reiser@...esys.com, lkml@...productions.com, jeff@...zik.org,
	tytso@....edu, linux-kernel@...r.kernel.org,
	reiserfs-list@...esys.com
Subject: Re: Solaris ZFS on Linux [Was: Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion]

On 7/31/06, David Lang <dlang@...italinsight.com> wrote:
> On Mon, 31 Jul 2006, Nate Diller wrote:
>
> > On 7/31/06, David Lang <dlang@...italinsight.com> wrote:
> >> On Mon, 31 Jul 2006, Nate Diller wrote:
> >>
> >> >
> >> > On 7/31/06, Matthias Andree <matthias.andree@....de> wrote:
> >> >> Adrian Ulrich wrote:
> >> >>
> >> >> > See also: http://spam.workaround.ch/dull/postmark.txt
> >> >> >
> >> >> > A quick'n'dirty ZFS-vs-UFS-vs-Reiser3-vs-Reiser4-vs-Ext3 'benchmark'
> >> >>
> >> >> Whatever Postmark does, this looks pretty much beside the point.
> >> >
> >> > why's that?  postmark is one of the standard benchmarks...
> >> >
> >> >> Are these actual transactions with the "D"urability guarantee?
> >> >> 3000/s doesn't look too much like you're doing synchronous I/O (else
> >> >> figures around 70/s perhaps 100/s would be more adequate), and cache
> >> >> exercise is rather irrelevant for databases that manage real (=valuable)
> >> >> data...
> >> >
> >> > Data:
> >> >       204.62 megabytes read (8.53 megabytes per second)
> >> >       271.49 megabytes written (11.31 megabytes per second)
> >> >
> >> > looks pretty I/O bound to me; 11.31 MB/s isn't exactly your latest DDR
> >> > RAM bandwidth.  as for the synchronous I/O question, Reiser4 in this
> >> > case acts more like a log-based FS.  That allows it to "overlap"
> >> > synchronous operations that are being submitted by multiple threads.
> >>
> >> what you are missing is that apps that need to do lots of syncing
> >> (databases, mail servers) need to wait for the data to hit non-volatile
> >> media before the write is complete. this limits such apps to ~1 write per
> >> revolution of the platters (yes, it's possible for a limited time to have
> >> multiple writes to different things that happen to be on the same track,
> >> but the counter is the extra seek time needed between tracks)
> >
> > this is true so long as there is only one thread submitting I/O and
> > doing fsync().  for something like a mail server, it can run
> > multi-threaded, and still get data integrity, if the changes are
> > spread out across more than one file.
>
> only if those multiple files all happen to live (along with their metadata) on
> the same track.

this is only a limitation for filesystems that do in-place data and
metadata updates.  this is why i mentioned the similarities to
log-structured file systems (see Rosenblum and Ousterhout, 1991).  they
observed an order-of-magnitude increase in performance for such
workloads on their system.
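
to make the "overlap" concrete, here is a rough user-space sketch of the
group-commit idea (illustration only, not reiser4's actual code; the file
name, record size, and thread count are made up).  several threads each
need a durable write, but one committer folds everything queued so far
into a single sequential append plus a single fsync(), so the whole batch
retires in one pass of the platter:

/*
 * group_commit.c -- toy illustration of the effect described above.
 * NOT reiser4 code; names and sizes are invented for the example.
 *
 * NWORKERS threads each need one durable record.  A single committer
 * thread gathers whatever is queued and retires the whole batch with
 * one sequential write() + fsync(), so the number of durable
 * "transactions" per second is not limited to one per fsync().
 *
 * build: gcc -pthread group_commit.c -o group_commit
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NWORKERS 8
#define RECSZ    512

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  more = PTHREAD_COND_INITIALIZER;  /* records queued   */
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;  /* batch is on disk */
static char buf[NWORKERS * RECSZ];     /* pending log records              */
static size_t pending;                 /* bytes queued, not yet committed  */
static unsigned long batch;            /* commit sequence number           */
static int logfd;

/* queue one record, then sleep until a later batch has covered it */
static void *worker(void *arg)
{
	pthread_mutex_lock(&lock);
	memset(buf + pending, 'a' + (long)arg, RECSZ);
	pending += RECSZ;
	unsigned long mine = batch;
	pthread_cond_signal(&more);
	while (batch == mine)          /* durable once batch advances */
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* turn however many queued records there are into ONE append + ONE fsync */
static void *committer(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&lock);
		while (pending == 0)
			pthread_cond_wait(&more, &lock);
		if (write(logfd, buf, pending) != (ssize_t)pending)
			perror("write");
		fsync(logfd);          /* one platter pass covers the batch */
		pending = 0;
		batch++;
		pthread_cond_broadcast(&done);
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NWORKERS], ctid;

	logfd = open("grouplog.bin", O_WRONLY | O_CREAT | O_APPEND, 0644);
	if (logfd < 0) {
		perror("open");
		return 1;
	}
	pthread_create(&ctid, NULL, committer, NULL);
	for (long i = 0; i < NWORKERS; i++)
		pthread_create(&tid[i], NULL, worker, (void *)i);
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);
	printf("%d records made durable in %lu fsync batch(es)\n",
	       NWORKERS, batch);
	return 0;
}

with in-place updates each of those threads would instead need its own
seek and its own wait for the platter to come around, which is exactly
the one-commit-per-revolution ceiling being described above.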

> >> so any benchmark that shows more transactions than the media has
> >> revolutions is highly suspect (now if you have battery-backed cache, or
> >> the equivalent, you can blow past these limits)
> >
> > not all workloads are completely serial; transactions themselves may
> > have no inter-dependencies at all.  so it depends on the benchmark,
> > and what workload you're measuring.  in cases like this, threading can
> > have a big advantage.
>
> in the real world (and benchmarks that simulate it fairly) the data spans
> multiple tracks, so your best case is considerably less than the max I listed
> because you frequently have to seek around a lot to do your writes to multiple
> places on disk. more threads running should mean that you are attempting to
> write to more places on disk, which will cause more seeks, dropping you further
> below the max.

postmark is very much real world.  reiser4 just doesn't always do
in-place writes.
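
to put rough numbers on the rotation argument (assuming a 7200 RPM disk,
which the benchmark page doesn't actually state): 7200 / 60 = 120
revolutions per second, so one in-place synchronous commit per revolution
caps out near 120/s, and lower once you add the seeks you describe, which
lines up with Matthias's 70-100/s estimate.  3000 transactions per second
on such a disk works out to roughly 25 to 43 transactions retiring per
revolution, which is about what you'd expect if many threads' commits are
being folded into each sequential log flush rather than each one doing
its own in-place write and seek.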

NATE
