Message-ID: <CAAK6Zt0C_cuTP3U2AdRVdhAQqA04O8y1TqaikA+egJV1jYbgYg@mail.gmail.com>
Date:	Wed, 31 Aug 2011 16:59:26 -0700
From:	Daniel Ehrenberg <dehrenberg@...gle.com>
To:	guy keren <choo@...com.co.il>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jens Axboe <axboe@...nel.dk>, Jeff Moyer <jmoyer@...hat.com>,
	linux-kernel@...r.kernel.org, linux-aio@...ck.org
Subject: Re: Approaches to making io_submit not block

On Wed, Aug 31, 2011 at 4:48 PM, guy keren <choo@...com.co.il> wrote:
> On Wed, 2011-08-31 at 16:16 -0700, Daniel Ehrenberg wrote:
>> On Tue, Aug 30, 2011 at 11:04 PM, guy keren <choo@...com.co.il> wrote:
>> > On Tue, 2011-08-30 at 15:54 -0700, Andrew Morton wrote:
>> >> On Tue, 30 Aug 2011 15:45:35 -0700
>> >> Daniel Ehrenberg <dehrenberg@...gle.com> wrote:
>> >>
>> >> > >> Not quite sure, and after working on them and fixing things up, I don't
>> >> > >> even think they are that complex or intrusive (which I think otherwise
>> >> > >> would've been the main objection). Andrew may know/remember.
>> >> > >
>> >> > > Boy, that was a long time ago.  I was always unhappy with the patches
>> >> > > because of the amount of additional code/complexity they added.
>> >> > >
>> >> > > Then the great syslets/threadlets design session happened and it was
>> >> > > expected that such a facility would make special async handling for AIO
>> >> > > unnecessary.  Then syslets/threadlets didn't happen.
>> >> >
>> >> > Do you think we could accomplish the goals with less additional
>> >> > code/complexity? It looks like the latest version of the patch set
>> >> > wasn't so invasive.
>> >> >
>> >> > If syslets/threadlets aren't happening, should these patches be
>> >> > reconsidered for inclusion in the kernel?
>> >>
>> >> I haven't seen any demand at all for the feature in many years.  That
>> >> doesn't mean that there _isn't_ any demand - perhaps everyone got
>> >> exhausted.
>> >
>> > You should consider the emerging enterprise-grade SSD devices - which
>> > can serve several tens of thousands of I/O requests per device (actually
>> > per controller). These devices could be better utilized by better
>> > interfaces. Furthermore, in our company we had to resort to using
>> > Windows for IOPS benchmarking (using iometer) against storage systems
>> > using these (and similar) devices, because it manages to generate higher
>> > IOPS than Linux can (I don't remember the exact numbers, but we are
>> > talking about an order of several hundred thousand IOPS).
>> >
>> > It could be that we are currently an esoteric use case - but the
>> > high-end performance market seems to be moving in that direction.
>>
>> I'm interested in SSD performance too. Could you tell me more about
>> your use case? Were you using a file system or a raw block device? The
>> patches we're discussing don't have any effect on a raw block device.
>
> Well, the use case I discussed specifically was with raw devices - not
> file systems.
>
> For file system info, I'll have to consult the people who were running
> the benchmarks at our workplace.
>
>> Do you have any particular ideas about a new interface? What does
>> Windows provide that Linux lacks that's relevant here?
>
> I don't know exactly what it provides that Linux does not - basically, it
> provides a similar asynchronous I/O API (using a mechanism they call
> "completion ports") - it just seems that they have a faster
> implementation (we compared execution on the same box, with 8Gb/s
> Fibre Channel connections, and we are comparing IOPS - not bandwidth or
> latency; the storage device is the product that we manufacture, which
> uses DRAM for storage).
>
> I can't tell you which specific part causes the performance difference -
> the AIO implementation, the multipath driver or something else.
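
For readers who haven't used them, the Windows "completion port" pattern
referred to above looks roughly like the sketch below. This is only an
illustration - the device path, queue depth and block size are invented,
and it is not the benchmark code discussed in this thread.

/*
 * Rough illustration of the Win32 I/O completion port pattern mentioned
 * above.  Not the benchmark discussed in this thread; the device path,
 * queue depth and block size are invented for the example.
 */
#include <windows.h>
#include <malloc.h>

#define QD    64      /* reads kept in flight */
#define BLKSZ 4096

int main(void)
{
    /* Open a raw device for overlapped (asynchronous), unbuffered I/O. */
    HANDLE dev = CreateFileA("\\\\.\\PhysicalDrive1", GENERIC_READ,
                             FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                             OPEN_EXISTING,
                             FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
                             NULL);
    if (dev == INVALID_HANDLE_VALUE)
        return 1;

    /* Create a completion port and associate the device handle with it. */
    HANDLE iocp = CreateIoCompletionPort(dev, NULL, 0, 0);
    if (iocp == NULL)
        return 1;

    /* Issue QD asynchronous reads; each ReadFile() normally returns at once
     * with ERROR_IO_PENDING and the completion arrives through the port. */
    static OVERLAPPED ov[QD];
    for (int i = 0; i < QD; i++) {
        void *buf = _aligned_malloc(BLKSZ, BLKSZ);  /* sector-aligned for NO_BUFFERING */
        ov[i].Offset = (DWORD)(i * BLKSZ);
        if (!ReadFile(dev, buf, BLKSZ, NULL, &ov[i]) &&
            GetLastError() != ERROR_IO_PENDING)
            return 1;
    }

    /* Reap the completions; a benchmark loop would resubmit here. */
    for (int done = 0; done < QD; done++) {
        DWORD bytes;
        ULONG_PTR key;
        LPOVERLAPPED pov;
        if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &pov, INFINITE))
            return 1;
    }

    CloseHandle(iocp);
    CloseHandle(dev);
    return 0;
}
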
>
> Internally, inside the box, we had problems when attempting to recover
> after a disconnection, back when we used iSCSI as our internal
> transport. We stopped using it - so this is no longer relevant for us -
> but the phenomenon we saw was that at certain times, when we had many (a
> few tens of) AIO operations to perform at once, it could take several
> seconds just to send them all (I'm not talking about completion). This
> was when we used the POSIX API on top of Linux's AIO implementation
> (i.e. using librtkaio - not the user-space implementation in glibc).
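
The submission pattern described above - queueing a few tens of requests at
once through the POSIX AIO API (backed by the kernel's AIO when librtkaio is
used) - looks roughly like the sketch below. It is illustrative only: the
device path, request count and block size are invented, and this is not the
code that exhibited the multi-second submission times. Link with -lrt.

/*
 * Illustrative only: batch-submitting a few tens of POSIX AIO reads, the
 * pattern described above.  With librtkaio the same API is backed by the
 * kernel's io_submit().
 */
#define _GNU_SOURCE
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NREQ  64
#define BLKSZ 4096

int main(void)
{
    int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);  /* hypothetical raw device */
    if (fd < 0) { perror("open"); return 1; }

    static struct aiocb cbs[NREQ];
    struct aiocb *list[NREQ];

    for (int i = 0; i < NREQ; i++) {
        void *buf;
        if (posix_memalign(&buf, BLKSZ, BLKSZ))      /* O_DIRECT needs aligned buffers */
            return 1;
        memset(&cbs[i], 0, sizeof(cbs[i]));
        cbs[i].aio_fildes     = fd;
        cbs[i].aio_buf        = buf;
        cbs[i].aio_nbytes     = BLKSZ;
        cbs[i].aio_offset     = (off_t)i * BLKSZ;
        cbs[i].aio_lio_opcode = LIO_READ;
        list[i] = &cbs[i];
    }

    /* LIO_NOWAIT: this call is only supposed to queue the requests, which
     * is exactly the step that was observed to take seconds. */
    if (lio_listio(LIO_NOWAIT, list, NREQ, NULL) < 0)
        perror("lio_listio");

    /* Wait for all completions. */
    for (int i = 0; i < NREQ; i++) {
        const struct aiocb *one[1] = { &cbs[i] };
        while (aio_error(&cbs[i]) == EINPROGRESS)
            aio_suspend(one, 1, NULL);
        if (aio_return(&cbs[i]) < 0)
            perror("aio_return");
    }

    close(fd);
    return 0;
}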

I'm just as interested in improving the performance of the raw block
device as I am in that of the file system. Any more details you could
give me about this would be great. You're saying io_submit on a raw
block device blocked for several seconds? Did your POSIX AIO
implementation make sure not to overrun the queue length established in
io_setup? Could you provide the test code you used? Do you have
function-level CPU profiles available?
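
For reference, a read loop over the native io_setup()/io_submit()/
io_getevents() interface that never keeps more requests in flight than the
depth passed to io_setup() looks roughly like the sketch below. The device
path, depth and read count are arbitrary, and it is not a claim about how
the reported tests were written. Link with -laio.

/*
 * Illustrative sketch: a read loop over the native Linux AIO interface
 * (libaio) that never keeps more requests in flight than the depth
 * passed to io_setup().
 */
#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define QD    64                  /* queue depth handed to io_setup() */
#define BLKSZ 4096
#define TOTAL (64 * 1024L)        /* total number of 4K reads to issue */

int main(void)
{
    int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);   /* hypothetical raw device */
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(QD, &ctx)) { perror("io_setup"); return 1; }

    struct iocb iocbs[QD];
    void *bufs[QD];
    int free_slot[QD], nfree = QD;
    for (int i = 0; i < QD; i++) {
        if (posix_memalign(&bufs[i], BLKSZ, BLKSZ))   /* O_DIRECT alignment */
            return 1;
        free_slot[i] = i;
    }

    struct io_event events[QD];
    long next = 0, inflight = 0;

    while (next < TOTAL || inflight > 0) {
        /* Top up, but never exceed the depth established in io_setup().
         * (Partial submission by io_submit() is not handled in this sketch.) */
        struct iocb *ptrs[QD];
        int batch = 0;
        while (next < TOTAL && nfree > 0) {
            int s = free_slot[--nfree];
            /* Wrap offsets over the first 4 MB so the example runs on any
             * device size. */
            io_prep_pread(&iocbs[s], fd, bufs[s], BLKSZ,
                          (long long)(next % 1024) * BLKSZ);
            ptrs[batch++] = &iocbs[s];
            next++;
        }
        if (batch > 0) {
            int r = io_submit(ctx, batch, ptrs);      /* the call whose blocking is at issue */
            if (r < 0) { fprintf(stderr, "io_submit: %d\n", r); return 1; }
            inflight += r;
        }

        /* Reap at least one completion before topping up again. */
        int got = io_getevents(ctx, 1, QD, events, NULL);
        if (got < 0) { fprintf(stderr, "io_getevents: %d\n", got); return 1; }
        for (int i = 0; i < got; i++)
            free_slot[nfree++] = (int)(events[i].obj - iocbs);
        inflight -= got;
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}
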
>
>> >
>> >> If there is demand then that should be described and circulated, see
>> >> how much interest there is in resurrecting the effort.
>> >>
>> >> And, of course, the patches should be dragged out and looked at - it's
>> >> been a number of years now.
>> >>
>> >> Also, glibc has a userspace implementation of POSIX AIO.  A successful
>> >> kernel-based implementation would result in glibc migrating away from
>> >> its current implementation.  So we should work with the glibc developers
>> >> on ensuring that the migration can happen.
>> >
>> > glibc's userspace implementation doesn't scale to fast devices. It could
>> > make sense when working with slower disk devices - not when you're
>> > working with solid-state storage devices.
>> >
>> > --guy
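
For context on why the userspace implementation tops out early: glibc
implements POSIX AIO in user space with threads that issue ordinary blocking
reads, so throughput is bounded by roughly (number of helper threads) /
(per-request latency). The fragment below is a deliberate conceptual
simplification of that approach, not glibc's actual code; toy_aio_read() and
aio_worker() are invented names.

/* Conceptual simplification of a thread-based POSIX AIO implementation
 * (not glibc's actual code): each request is handed to a thread that
 * performs a plain blocking pread(), which caps achievable IOPS well
 * below what an SSD-class device can deliver. */
#define _XOPEN_SOURCE 600
#include <aio.h>
#include <pthread.h>
#include <unistd.h>

static void *aio_worker(void *arg)
{
    struct aiocb *cb = arg;
    /* A synchronous read stands in for the whole asynchronous request. */
    ssize_t n = pread(cb->aio_fildes, (void *)cb->aio_buf,
                      cb->aio_nbytes, cb->aio_offset);
    (void)n;            /* a real implementation would record the result */
    return NULL;
}

/* Hypothetical submit path: one thread per request, no queueing logic. */
int toy_aio_read(struct aiocb *cb)
{
    pthread_t t;
    if (pthread_create(&t, NULL, aio_worker, cb))
        return -1;
    pthread_detach(t);
    return 0;
}
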
>> >
>> >
>>
>> Dan
>
> --guy
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/