[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45BCC17A.9090302@tmr.com>
Date: Sun, 28 Jan 2007 10:30:02 -0500
From: Bill Davidsen <davidsen@....com>
To: linux-kernel@...r.kernel.org
Cc: 7eggert@....de, Michael Tokarev <mjt@....msk.ru>,
Phillip Susi <psusi@....rr.com>,
Linus Torvalds <torvalds@...l.org>, Viktor <vvp01@...ox.ru>,
Aubrey <aubreylee@...il.com>, Hua Zhong <hzhong@...il.com>,
Hugh Dickins <hugh@...itas.com>, linux-kernel@...r.kernel.org,
hch@...radead.org, kenneth.w.chen@in
Subject: Re: O_DIRECT question
Denis Vlasenko wrote:
> On Saturday 27 January 2007 15:01, Bodo Eggert wrote:
>> Denis Vlasenko <vda.linux@...glemail.com> wrote:
>>> On Friday 26 January 2007 19:23, Bill Davidsen wrote:
>>>> Denis Vlasenko wrote:
>>>>> On Thursday 25 January 2007 21:45, Michael Tokarev wrote:
>>>>>> But even single-threaded I/O but in large quantities benefits from
>>>>>> O_DIRECT significantly, and I pointed this out before.
>>>>> Which shouldn't be true. There is no fundamental reason why
>>>>> ordinary writes should be slower than O_DIRECT.
>>>>>
>>>> Other than the copy to buffer taking CPU and memory resources.
>>> It is not required by any standard that I know. Kernel can be smarter
>>> and avoid that if it can.
>> The kernel can also solve the halting problem if it can.
>>
>> Do you really think an entropy estamination code on all access patterns in the
>> system will be free as in beer,
>
> Actually I think we need this heuristic:
>
> if (opened_with_O_STREAM && buffer_is_aligned
> && io_size_is_a_multiple_of_sectorsize)
> do_IO_directly_to_user_buffer_without_memcpy
>
> is not *that* compilcated.
>
> I think that we can get rid of O_DIRECT peculiar requirements
> "you *must* not cache me" + "you *must* write me directly to bare metal"
> by replacing it with O_STREAM ("*advice* to not cache me") + O_SYNC
> ("write() should return only when data is written to storage, not sooner").
>
> Why?
>
> Because these O_DIRECT "musts" are rather unusual and overkill. Apps
> should not have that much control over what kernel does internally;
> and also O_DIRECT was mixing shampoo and conditioner on one bottle
> (no-cache and sync writes) - bad API.
What a shame that other operating systems can manage to really support
O_DIRECT, and that major application software can use this api to write
portable code that works even on Windows.
You overlooked the problem that applications using this api assume that
reads are on bare metal as well, how do you address the case where
thread A does a write, thread B does a read? If you give thread B data
from a buffer and it then does a write to another file (which completes
before the write from thread A), and then the system crashes, you have
just put the files out of sync. So you may have to block all i/o for all
threads of the application to be sure that doesn't happen. Or introduce
some complex way to assure that all writes are physically done in
order... that sounds like a lock infested mess to me, assuming that you
could ever do it right.
Oracle has their own version of Linux now, do you think that they would
fork the application or the kernel?
--
Bill Davidsen <davidsen@....com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists