lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070530090537.GB17744@elte.hu>
Date:	Wed, 30 May 2007 11:05:37 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Evgeniy Polyakov <johnpol@....mipt.ru>
Cc:	Ulrich Drepper <drepper@...hat.com>, Jeff Garzik <jeff@...zik.org>,
	Zach Brown <zach.brown@...cle.com>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>,
	Christoph Hellwig <hch@...radead.org>,
	Andrew Morton <akpm@....com.au>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	"David S. Miller" <davem@...emloft.net>,
	Suparna Bhattacharya <suparna@...ibm.com>,
	Davide Libenzi <davidel@...ilserver.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Syslets, Threadlets, generic AIO support, v6


* Evgeniy Polyakov <johnpol@....mipt.ru> wrote:

> On Wed, May 30, 2007 at 10:42:52AM +0200, Ingo Molnar (mingo@...e.hu) wrote:
> > it is a serious flexibility issue that should not be ignored. The 
> > unified fd space is a blessing on one hand because it's simple and 
> > powerful, but it's also a curse because nested use of the fd space for 
> > libraries is currently not possible. But it should be detached from any
> > fundamental question of kevent vs. epoll. (By improving library use of
> > file descriptors we'll improve the utility of all syscalls - by ducking
> > to a memory based API we only solve that particular event based usage.)
> 
> There is another issue with file descriptors - userspace must dig into 
> kernel each time it wants to get a new set of events, while with 
> memory based approach it has them without doing so. After it has 
> returned from kernel and know that there are some evetns, kernel can 
> add more of them into the ring (if there is a place) and userspace 
> will process them withouth additional syscalls.

Firstly, this is not a fundamental property of epoll. If we wanted to, 
it would be possible to extend epoll to fill in a ring of events from 
the wakeup handler. It's an incremental add-on to epoll that should not 
impact the design. How much info to put into a single event is another 
incremental thing - for most of the high-performance cases all the 
information we need is the type of the event and the fd it occured on. 
Currently epoll supports that minimal approach.

Secondly, our current syscall overhead is below 0.1 usecs on latest 
hardware:

  dione:~/l> ./lat_syscall null
  Simple syscall: 0.0911 microseconds

so you need millions of events _per cpu_ for the syscall overhead to 
show up.

Thirdly, our main problem was not the structure of epoll, our main 
problem was that event APIs were not widely available, so applications 
couldnt go to a pure event based design - they always had to handle 
certain types of event domains specially, due to lack of coverage. The
latest epoll patches largely address that. This was a huge barrier
against adoption of epoll.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ