Message-ID:  <45BCC17A.9090302@tmr.com>
Date:	Sun, 28 Jan 2007 10:30:02 -0500
From:	Bill Davidsen <davidsen@....com>
To:	linux-kernel@...r.kernel.org
Cc:	7eggert@....de, Michael Tokarev <mjt@....msk.ru>,
	Phillip Susi <psusi@....rr.com>,
	Linus Torvalds <torvalds@...l.org>, Viktor <vvp01@...ox.ru>,
	Aubrey <aubreylee@...il.com>, Hua Zhong <hzhong@...il.com>,
	Hugh Dickins <hugh@...itas.com>, linux-kernel@...r.kernel.org,
	hch@...radead.org, kenneth.w.chen@in
Subject:  Re: O_DIRECT question

Denis Vlasenko wrote:
> On Saturday 27 January 2007 15:01, Bodo Eggert wrote:
>> Denis Vlasenko <vda.linux@...glemail.com> wrote:
>>> On Friday 26 January 2007 19:23, Bill Davidsen wrote:
>>>> Denis Vlasenko wrote:
>>>>> On Thursday 25 January 2007 21:45, Michael Tokarev wrote:
>>>>>> But even single-threaded I/O in large quantities benefits from
>>>>>> O_DIRECT significantly, and I pointed this out before.
>>>>> Which shouldn't be true. There is no fundamental reason why
>>>>> ordinary writes should be slower than O_DIRECT.
>>>>>
>>>> Other than the copy to buffer taking CPU and memory resources.
>>> It is not required by any standard that I know. Kernel can be smarter
>>> and avoid that if it can.
>> The kernel can also solve the halting problem if it can.
>>
>> Do you really think an entropy estimation code on all access patterns in the
>> system will be free as in beer,
> 
> Actually I think we need this heuristic:
> 
> if (opened_with_O_STREAM && buffer_is_aligned
> 		&& io_size_is_a_multiple_of_sectorsize)
> 	do_IO_directly_to_user_buffer_without_memcpy
> 
> is not *that* complicated.
> 
> I think that we can get rid of O_DIRECT peculiar requirements
> "you *must* not cache me" + "you *must* write me directly to bare metal"
> by replacing it with O_STREAM ("*advice* to not cache me") + O_SYNC
> ("write() should return only when data is written to storage, not sooner").
> 
> Why?
> 
> Because these O_DIRECT "musts" are rather unusual and overkill. Apps
> should not have that much control over what kernel does internally;
> and also O_DIRECT was mixing shampoo and conditioner in one bottle
> (no-cache and sync writes) - bad API.
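
For reference, the test in the quoted heuristic can be sketched in C
roughly as follows. This is only an illustration under assumptions:
O_STREAM is a proposed flag that does not exist, so it is modelled as a
plain boolean, and the function name and the sector_size parameter are
hypothetical. It checks eligibility only; it does no I/O.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true when a request
 * meets the conditions of the quoted heuristic, i.e. the file was opened
 * with the proposed O_STREAM hint, the user buffer is sector-aligned,
 * and the length is a non-zero multiple of the sector size.
 */
bool can_skip_buffer_copy(bool opened_with_stream, const void *buf,
                          size_t len, size_t sector_size)
{
    /* Assumption: sector sizes are non-zero powers of two (512, 4096...). */
    if (sector_size == 0 || (sector_size & (sector_size - 1)) != 0)
        return false;

    bool buffer_is_aligned = ((uintptr_t)buf & (sector_size - 1)) == 0;
    bool whole_sectors = len != 0 && (len % sector_size) == 0;

    return opened_with_stream && buffer_is_aligned && whole_sectors;
}
```

A request that fails the test would simply go through the normal
buffered path, which is what makes it advice rather than a requirement.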

What a shame that other operating systems can manage to really support
O_DIRECT, and that major application software can use this API to write
portable code that works even on Windows.

You overlooked the problem that applications using this API assume that
reads go to bare metal as well. How do you address the case where
thread A does a write and thread B does a read? If you give thread B data
from a buffer and it then does a write to another file (which completes
before the write from thread A), and then the system crashes, you have
just put the files out of sync. So you may have to block all I/O for all
threads of the application to be sure that doesn't happen. Or introduce
some complex way to ensure that all writes are physically done in
order... that sounds like a lock-infested mess to me, assuming that you
could ever do it right.
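
To make the ordering concern concrete, here is a minimal sketch of what
thread B would have to do itself if its read could be satisfied from a
buffer: force the source file's data to stable storage before issuing
the dependent write. The helper name and its parameters are hypothetical
and not taken from any application; only pread(), pwrite() and
fdatasync() are real POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes at offset off from src_fd to
 * dst_fd without letting dst_fd get ahead of src_fd on disk.
 * Returns 0 on success, -1 on any error or short transfer.
 */
int propagate_in_order(int src_fd, int dst_fd, void *buf,
                       size_t len, off_t off)
{
    /* This read may be served from cached, not yet durable, data. */
    ssize_t n = pread(src_fd, buf, len, off);
    if (n < 0 || (size_t)n != len)
        return -1;

    /* Make the source durable before anything that depends on it. */
    if (fdatasync(src_fd) != 0)
        return -1;

    n = pwrite(dst_fd, buf, len, off);
    if (n < 0 || (size_t)n != len)
        return -1;

    /* Flush the dependent write as well. */
    return fdatasync(dst_fd);
}
```

The extra fdatasync() on every dependent write is exactly the kind of
serialization an application written against O_DIRECT semantics does
not expect to need.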

Oracle has their own version of Linux now. Do you think that they would
fork the application or the kernel?

-- 
Bill Davidsen <davidsen@....com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
