[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0702141223270.20368@woody.linux-foundation.org>
Date: Wed, 14 Feb 2007 12:38:16 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@....com.au>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Ulrich Drepper <drepper@...hat.com>,
Zach Brown <zach.brown@...cle.com>,
Evgeniy Polyakov <johnpol@....mipt.ru>,
"David S. Miller" <davem@...emloft.net>,
Benjamin LaHaise <bcrl@...ck.org>,
Suparna Bhattacharya <suparna@...ibm.com>,
Davide Libenzi <davidel@...ilserver.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 05/11] syslets: core code
On Tue, 13 Feb 2007, Ingo Molnar wrote:
>
> the core syslet / async system calls infrastructure code.
Ok, having now looked at it more, I can say:
- I hate it.
I dislike it intensely, because it's so _close_ to being usable. But the
programming interface looks absolutely horrid for any "casual" use, and
while the loops etc look like fun, I think they are likely to be less than
useful in practice. Yeah, you can do the "setup and teardown" just once,
but it ends up being "once per user", and it ends up being a lot of stuff
to do for somebody who wants to just do some simple async stuff.
And the whole "lock things down in memory" approach is bad. It's doing
expensive things like mlock(), making the overhead for _single_ system
calls much more expensive. Since I don't actually believe that the
non-single case is even all that interesting, I really don't like it.
I think it's clever and potentially useful to allow user mode to see the
data structures (and even allow user mode to *modify* them) while the
async thing is running, but it really seems to be a case of excessive
cleverness.
For example, how would you use this to emulate the *current* aio_read()
etc interfaces that don't have any user-level component except for the
actual call? And if you can't do that, the whole exercise is pointless.
Or how would you do the trivial example loop that I explained was a good
idea:
struct one_entry *prev = NULL;
struct dirent *de;
while ((de = readdir(dir)) != NULL) {
struct one_entry *entry = malloc(..);
/* Add it to the list, fill in the name */
entry->next = prev;
prev = entry;
strcpy(entry->name, de->d_name);
/* Do the stat lookup async */
async_stat(de->d_name, &entry->stat_buf);
}
wait_for_async();
.. Ta-daa! All done ..
Notice? This also "chains system calls together", but it does it using a
*much* more powerful entity called "user space". That's what user space
is. And yeah, it's a pretty complex sequencer, but happily we have
hardware support for accelerating it to the point that the kernel never
even needs to care.
The above is a *realistic* schenario, where you actually have things like
memory allocation etc going on. In contrast, just chaining system calls
together isn't a realistic schenario at all.
So I think we have one _known_ usage schenario:
- replacing the _existing_ aio_read() etc system calls (with not just
existing semantics, but actually binary-compatible)
- simple code use where people are willing to perhaps do something
Linux-specific, but because it's so _simple_, they'll do it.
In neither case does the "chaining atoms together" seem to really solve
the problem. It's clever, but it's not what people would actually do.
And yes, you can hide things like that behind an abstraction library, but
once you start doing that, I've got three questions for you:
- what's the point?
- we're adding overhead, so how are we getting it back
- how do we handle independent libraries each doing their own thing and
version skew between them?
In other words, the "let user space sort out the complexity" is not a good
answer. It just means that the interface is badly designed.
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists