Message-ID: <alpine.DEB.1.10.0810062025301.19471@asgard.lang.hm>
Date: Mon, 6 Oct 2008 20:37:59 -0700 (PDT)
From: david@...g.hm
To: Mikulas Patocka <mpatocka@...hat.com>
cc: Nick Piggin <nickpiggin@...oo.com.au>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, agk@...hat.com, mbroz@...hat.com,
chris@...chsys.com
Subject: Re: application syncing options (was Re: [PATCH] Memory management
livelock)
On Sun, 5 Oct 2008, Mikulas Patocka wrote:
> On Sun, 5 Oct 2008, david@...g.hm wrote:
>
>> On Sun, 5 Oct 2008, Mikulas Patocka wrote:
>>
>>> On Fri, 3 Oct 2008, david@...g.hm wrote:
>>>
>>>> I've also seen discussions of how the
>>>> kernel filesystem code can do ordered writes without having to wait for
>>>> them
>>>> with the use of barriers, is this capability exported to userspace? if so,
>>>> could you point me at documentation for it?
>>>
>>> It isn't. And it is good that it isn't --- the more complicated API, the
>>> more maintenance work.
>>
>> I can understand that most software would not want to deal with complications
>> like this, but for things that have requirements similar to journaling
>> filesystems (databases for example) it would seem that there would be
>> advantages to exposing this capability.
>>
>> David Lang
>
> If you invent new interface that allows submitting several ordered IOs
> from userspace, it will require excessive maintenance overhead over long
> period of time. So it should be only justified, if the performance
> improvement is excessive as well.
>
> It should not be like "here you improve 10% performance on some synthetic
> benchmark in one application that was rewritten to support the new
> interface" and then create a few more security vulnerabilities (because of
> the complexity of the interface) and damage overall Linux progress,
> because everyone is catching bugs in the new interface and checking it for
> correctness.

the same benchmarks that show that it's far better for the in-kernel
filesystem code to use write barriers should apply to FUSE filesystems.

this isn't a matter of a few % in performance. if an application is
sync-limited in a way that can be converted to write-ordered, the potential
is for the application to speed up by many times.

programs that maintain indexes or caches of data that lives in other files
will be able to do write data && barrier && write index && fsync, and double
their performance vs. write data && fsync && write index && fsync.
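
To make the comparison concrete, here is a minimal sketch of the pattern
that is available today (write data && fsync && write index && fsync). The
function names, file layout and record format are assumptions made up for
illustration. Since no barrier call is exported to userspace, the place
where one would go is only marked with a comment.

```python
import os
import tempfile


def write_all(fd, payload):
    """Write the whole payload; os.write may write only part of it."""
    view = memoryview(payload)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def update_data_then_index(data_path, index_path, record, index_entry):
    """Append a record, then the index entry that refers to it.

    Returns the number of fsync calls issued (hypothetical helper).
    """
    syncs = 0

    data_fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(data_fd, record)
        # Today: block until the record is on stable storage, so the index
        # can never refer to data that did not reach the disk.
        # A barrier (hypothetical, not available to userspace) would only
        # have to order this write before the index write, without waiting.
        os.fsync(data_fd)
        syncs += 1
    finally:
        os.close(data_fd)

    index_fd = os.open(index_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(index_fd, index_entry)
        # The final fsync is needed in both variants.
        os.fsync(index_fd)
        syncs += 1
    finally:
        os.close(index_fd)

    return syncs


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    count = update_data_then_index(
        os.path.join(workdir, "data"),
        os.path.join(workdir, "index"),
        b"hello\n",
        b"0 6\n",
    )
    print("fsync calls:", count)
```

Each update waits on the disk twice here. The barrier variant described
above would wait only once, which is where the estimated doubling comes
from.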

databases can potentially do even better. today they need to fsync data to
disk before they can update their journal to indicate that the data has
been written. with a barrier they could order the writes so that the write
to the journal doesn't happen until after the writes of the data. they would
never need to call fsync at all (when emptying the journal).
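
As a rough illustration of the journal case, the sketch below empties a
journal the way it has to be done today: the data is forced to disk before
the journal entries are discarded. The checkpoint function and the
"journal is just the raw bytes to append" format are simplifying
assumptions, not how any particular database does it.

```python
import os


def checkpoint(journal_path, data_path):
    """Copy journalled bytes into the data file, then empty the journal.

    Returns the number of bytes moved (hypothetical helper).
    """
    with open(journal_path, "rb") as journal:
        pending = journal.read()
    if not pending:
        return 0

    with open(data_path, "ab") as data:
        data.write(pending)
        data.flush()             # push Python's buffer to the kernel
        os.fsync(data.fileno())  # today: wait until the data is on disk

    # Only now is it safe to drop the journal entries. With an ordering
    # primitive (hypothetical), the fsync above could be replaced by a
    # barrier placed before this truncate.
    with open(journal_path, "r+b") as journal:
        journal.truncate(0)

    return len(pending)
```

The fsync in the middle exists only to enforce ordering between the data
write and the journal update, which is the wait a barrier would remove.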

for systems without solid-state drives or battery-backed caches, the
ability to eliminate fsyncs by being able to rely on the order of the
writes is a huge benefit.

David Lang