[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070214111035.GB32612@2ka.mipt.ru>
Date: Wed, 14 Feb 2007 14:10:35 +0300
From: Evgeniy Polyakov <johnpol@....mipt.ru>
To: Ingo Molnar <mingo@...e.hu>
Cc: Benjamin LaHaise <bcrl@...ck.org>, Alan <alan@...rguk.ukuu.org.uk>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@....com.au>,
Ulrich Drepper <drepper@...hat.com>,
Zach Brown <zach.brown@...cle.com>,
"David S. Miller" <davem@...emloft.net>,
Suparna Bhattacharya <suparna@...ibm.com>,
Davide Libenzi <davidel@...ilserver.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 00/11] ANNOUNCE: "Syslets", generic asynchronous system call support
On Wed, Feb 14, 2007 at 11:37:31AM +0100, Ingo Molnar (mingo@...e.hu) wrote:
> > Let me clarify what I meant. There is only limited number of threads,
> > which are supposed to execute blocking context, so when all they are
> > used, main one will block too - I asked about possibility to reuse the
> > same thread to execute queue of requests attached to it, each request
> > can block, but if blocking issue is removed, it would be possible to
> > return.
>
> ah, ok, i understand your point. This is not quite possible: the
> cachemisses are driven from schedule(), which can be arbitraily deep
> inside arbitrary system calls. It can be in a mutex_lock() deep inside a
> driver. It can be due to a alloc_pages() call done by a kmalloc() call
> done from within ext3, which was called from the loopback block driver,
> which was called from XFS, which was called from a VFS syscall.
That's only because of schedule() is a main point where
'rescheduling'/requeuing (task switch in other words) happens - but if
it will be possible to bypass schedule()'s decision and not reschedule
there, but 'on demand', will it be possible to reuse the same syslet?
Let me show an example:
consider aio_sendfile() on a big file, so it is not possible to fully
get it into VFS, but having spinning on per-page basis (like right now)
is no optial solution too. For kevent AIO I created new address space
operation aio_getpages() which is essentially mpage_readpages() - it
populates several pages into VFS in one BIO (if possible, otherwise in
the smallest possible number of chunks) and then in bio destruction
callback (actually in bio_endio callback, but for that case it can be
considered as the same) I reschedule the same request to some other (not
exactly the same as started) thread. When processed data is being sent
and next chunk of the file is populated to the VFS using aio_getpages(),
which in BIO callback will reschedule the same request again.
So it is possible with essentially one thread (or limited number of
them) to fill the whole IO pipe.
With syslet approach it seems to be impossible due to the fact, that
request is a whole sendfile. Even if one uses proper readahed (fadvise)
advise, there is no possibility to split sendfile and form it as a set
of essentially the same requests with different start/offset/whatever
parameters (well, exactly for senfile() it is possible - just setup
several calls in one syslet from different offsets and with different
lengths and form a proper state machine of them, but for example TCP
recv() will not match that scenario).
So my main question was about possibility to reuse syslet state machine
in kevent AIO instead of own (althtough own one lacks only one good
feature of syslets threads currently - its set of threads is global,
but not per-task, which does not allow to scale good with number of
different processes doing IO) so to not duplicate the code if kevent is
ever be possible to get into.
--
Evgeniy Polyakov
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists