[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <47436D07.2070203@zytor.com>
Date: Tue, 20 Nov 2007 15:25:59 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Zach Brown <zach.brown@...cle.com>,
Ulrich Drepper <drepper@...hat.com>,
David Miller <davem@...emloft.net>,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
tglx@...utronix.de, torvalds@...ux-foundation.org
Subject: Re: [PATCHv4 5/6] Allow setting O_NONBLOCK flag for new sockets
Ingo Molnar wrote:
> do you suggest that extending the system call calling convention to
> include an arbitrary number of parameters will solve all these API needs
> we have at the moment?
>
> if yes, then a one-shot syslet/async call would in essence be:
>
> syslet_arg1 ... N, syscall_arg 1 ... M
>
> the same is true for the indirect stuff, we in essence nest syscalls
> inside another syscall:
>
> sys_indirect arg1 ... N, syscall arg 1 ... M
>
> this all assumes an arbitrarily extendable syscall ABI, which can take
> N+M parameters. Right?
>
> i'm not entirely sure we really want to do this. Nested syscalls would
> have to unpack the arguments and repack them into a kernel-internal call
> format anyway. So there's no performance upside - in fact i can only see
> additional complications.
Forget about indirection for the moment. Let's first look at the need
of plain system calls.
> Why not just pin down the current ABI that there's 6 syscall parameters
> _and not more_?
Because we have already violated it. There are system calls that need
more than 6 arguments: we need *a* convention. Worse, we're not
actually talking 6 *arguments*, we're talking 6 *words*; on 32-bit
platforms a single argument can occupy two words.
Uli talks about the need to adding additional system calls with
parameters, and suggests "back-dooring" them via the sys_indirect interface.
pselect introduced the convention that to take more than 6 arguments,
the 6th argument register contains a pointer into a user-space memory
area which contains the real arguments 6 and above. This is a simple
convention, which can be trivially executed as a rule set. Furthermore,
*with some care* it can be mapped 1:1 onto the C calling convention by
system-call-generic code, as opposed to needing system-call-specific
stubs to marshall parameters.
Now, if you execute that asynchronously, you of course need to make sure
that userspace doesn't clobber those additional arguments, so you would
have to save them away or otherwise restrict userspace from reclaiming this.
** This is the situation as it stands today, and any solution needs to
take this into account. **
> It's totally sensible, and indirection has some minimal
> costs anyway, so copying the nested syscall parameters is a non-issue.
> This is not ad-hoc and when i wrote syslets i actually profiled the
> performance a variable-length calling convention and decided _against_
> it. Nothing beats the performance of a straight fixd-length copy of 6x4
> (or 6x8) bytes.
>
> The memory access cost argument you mentioned is largely irrelevant and
> inapposite here, this is all passed in on the stack which is well-cached
> in the L1 cache.
What I'm objecting to, strongly, is the use of this syslets-style
indirection for unrelated purposes, such as modifying the behaviour of
existing system calls. The sys_indirect call, as far as I understand
it, basically is a way to inject commands into a separate
hyper-lightweight thread of kernel execution. That's fine so far.
However, proposing that we should have system calls (call a spade a
spade) which can ONLY be accessed via this indirection interface is bad
interface design at best and something much stronger at worst. What we
have done in the past when we want to add new parameters to a system
call is that we assign it a new system call number, and point the old
system call number to a thunk which sets the new parameters to specific
default values and then tailcalls the new system call. This is a very
straightforward thing to do, and imposes any costs at all only on users
of the legacy system call number -- and then they are only a handful of
instructions.
-hpa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists