[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070223111245.GF22870@elte.hu>
Date: Fri, 23 Feb 2007 12:12:45 +0100
From: Ingo Molnar <mingo@...e.hu>
To: David Miller <davem@...emloft.net>
Cc: johnpol@....mipt.ru, arjan@...radead.org, drepper@...hat.com,
linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
hch@...radead.org, akpm@....com.au, alan@...rguk.ukuu.org.uk,
zach.brown@...cle.com, suparna@...ibm.com, davidel@...ilserver.org,
jens.axboe@...cle.com, tglx@...utronix.de
Subject: Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3
* David Miller <davem@...emloft.net> wrote:
> > furthermore, in a real webserver there's a whole lot of other stuff
> > happening too: VFS blocking, mutex/lock blocking, memory pressure
> > blocking, filesystem blocking, etc., etc. Threadlets/syslets cover
> > them /all/ and never hold up the primary context: as long as there's
> > more requests to process, they will be processed. Plus other
> > important networked workloads, like fileservers are typically on
> > fast LANs and those requests are very much a fire-and-forget matter
> > most of the time.
>
> I expect clients of a fileserver to cause the server to block in
> places such as tcp_sendmsg() as much if not more so than a webserver
> :-)
yeah, true. But ... i'd still like to midly disagree with the
characterisation that due to blocking being the norm in networking, that
this goes against the concept of syslets/threadlets.
Networking has a 'work cache' too, in an abstract sense, which works in
favor of syslets: the socket buffers. If there is a reasonably sized
output queue for sockets (not extremely small like 4k per socket but
something at least 16k), then user-space can chunk up its workload along
pretty reasonable lines without assuming too much, and turn one such
'chunk' into one atomic step done in the 'syslet/threadlet request'. In
the rare cases where blocking happens in an unexpected way, the
syslet/threadlet 'goes async' but that only holds up that particular
request, and only for that chunk, not the main loop of processing and
doesnt force the request into an async thread forever.
the kevent model is very much about: /never every/ block the main
thread. If you ever block, performance goes down the drain.
the syslet performance model is: if you block less than say 10-20% of
the time, you are basically as fast as the most extreme kevents based
model. Syslets/threadlets also avoids the fundamental workflow problem
that nonblocking designs have in a natural way: 'how do we guarantee
that the system progresses forward', which problem makes nonblocking
code quite fragile.
another property is that the performance curve is alot less sensitive to
blocking in the syslet model - and real user-space servers will always
have unexpected points of blockage - unless all of userspace code is
perfectly converted into state machines - which is not reasonable. So
with syslets we are not forced to program everything as state-machines
in user-space, such techniques are only needed to reduce the amount of
context-switching and to reduce the RAM footprint - they wont change
fundamental scalability.
plus there's the hidden advantage of having a 'constructed state' on the
kernel stack: thread that blocks in the middle of tcp_sendmsg() has
quite some state: the socket has been looked up in the hash(es), all
input parameters have been validated, the timer has been set, skbs have
been allocated ahead, etc. Those things do add up especially if you are
after a long wait and all those things are scattered around the memory
map cache-cold - not nicely composed into a single block of memory on
the stack (generating only a single cachemiss in essence).
All in one, i'm cautiosly optimistic that even a totally naive,
blocks-itself-for-every-request syslet application would perform pretty
close to a Tux/kevents type of nonblock+event-queueing based application
- with a vastly higher generic utility benefit. So i'm not dismissing
this possibility and i'm not writing off syslet performance just because
they do context-switches :-)
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists