lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 26 Feb 2007 13:22:34 -0800 (PST)
From:	Davide Libenzi <davidel@...ilserver.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Evgeniy Polyakov <johnpol@....mipt.ru>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>,
	Christoph Hellwig <hch@...radead.org>,
	Andrew Morton <akpm@....com.au>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Ulrich Drepper <drepper@...hat.com>,
	Zach Brown <zach.brown@...cle.com>,
	"David S. Miller" <davem@...emloft.net>,
	Suparna Bhattacharya <suparna@...ibm.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: threadlets as 'naive pool of threads', epoll, some measurements

On Mon, 26 Feb 2007, Ingo Molnar wrote:

> 
> update:
> 
> > i have tried the one Evgeniy provided in the URL:
> > 
> >   http://tservice.net.ru/~s0mbre/archive/kevent/evserver_epoll.c
> > 
> > and 'ab -k -c8000 -n80000' almost always aborts with:
> > 
> >   apr_socket_recv: Connection reset by peer (104)
> > 
> > in the few cases it finishes, i got the following epoll result, over 
> > gigabit ethernet, on an UP Athlon64 box:
> > 
> >    eserver_epoll:     7800 reqs/sec
> 
> eserver_epoll.c had a number of bugs. The most serious one was the 
> apparently buggy use of "EPOLLET" (edge-triggered events). Removing that 
> and moving epoll to level-triggered (which is slower but does not result 
> in missed events) gives:
> 
>    eserver_epoll:       9400 reqs/sec
> 
> > the same with the most naive implementation of the same, using 
> > threadlets:
> > 
> >    eserver_threadlet: 5800 reqs/sec
> 
>    eserver_epoll_threadlet:   9400 reqs/sec
> 
> as expected, the level of extra blocking triggered by this is low - even 
> if the full request function runs without nonblock assumptions.

That looks pretty good. I started (spare time allowing) laying down the 
based for a more realistic test as far webserver-like benchmarking.
I want to compare three solutions, that uses the same internal code (as 
far as request parsing and content delivery).

1) A fully thread classical web "server". This does the trivial 
   accept+pthread_create and in there does open+fstat+sendhdrs+sendfile. 
   This is already done:

   http://www.xmailserver.org/thrhttp.c

   Do a `gcc -o thrhttp thrhttp.c -lpthread` and you're done.

2) A coronet (coroutine+epoll library) handling for network 
   events/dispatch, plus GUASI (Generic Userspace Asyncronous Syscall
   Interface) to handle generic IO.

   libpcl:   http://www.xmailserver.org/libpcl.html
   coronet:  http://www.xmailserver.org/coronet-lib.html
   GUASI:    http://www.xmailserver.org/guasi-lib.html

   cghttpd:  http://www.xmailserver.org/cghttpd-home.html

   All these are configure+make+make_install installable.
   The cghttpd server has the same parsing/content-delivery features of 
   thrhttp, but it uses coroutine+epoll to handle network events, and 
   GUASI (an old pthread-based userspace implementation of async execution)
   to give open/fstat/sendfile an async behaviour. It implements the 
   epoll_wait() (conet_events_wait() acutally, but that maps directly to 
   epoll_wait()) hosted inside an async GUASI request, and the GUASI 
   dispatch loop handle the case with:

   if (cookie == conet_events_wait_cookie)
       handle_network_events();

3) Finally, a very similar to cghttpd implementation, but this time using 
   the *syslets* to handle async requests. That should be a prety easy 
   change to cghttpd, by the mean of replacing the CGHTTPD_SYSCALL() macro 
   with proper syslet code. The big advantage of the syslets here, is that 
   they do have the cachehit optimization, while GUASI always have to 
   trigger a queueing.
   Of course, since the only machine I have with enough RAM to keep many 
   thousands of session active is a dual Opteron, I'd need to have an 
   x86-64 version of the patch. That shouldn't be a big problem though.


This hopefully may prove two points. First, a fully threaded solution does 
not scale well when dealing with thousands and thousands of sessions. 
Second, the cachehit syslet trick is The Man in the syslet code, and kicks 
userspace solution #2 ass.
In the meantime, I think Jens tests are more meaningfull, as far of usage 
field goes, than any network based test.




- Davide


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ