linux-kernel - Re: [PATCHv4 5/6] Allow setting O

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LFD.0.9999.0711261003290.5869@woody.linux-foundation.org>
Date:	Mon, 26 Nov 2007 10:17:58 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"H. Peter Anvin" <hpa@...or.com>
cc:	Ulrich Drepper <drepper@...hat.com>,
	David Miller <davem@...emloft.net>,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	mingo@...e.hu, tglx@...utronix.de
Subject: Re: [PATCHv4 5/6] Allow setting O_NONBLOCK flag for new sockets

On Tue, 20 Nov 2007, H. Peter Anvin wrote:
> 
> If the whole thing about "a dozen new [system calls]" then a dozen system
> calls added to the existing tables are better than this mess.

No it's not.

The point about the indirect calls is that we can do it for other things 
than just a dozen random things that wants this one flag.

We'll eventually want AIO calls for filename lookup etc, for example. 
That's another dozen calls (stat, lstat, open, etc). Having an indirect 
call interface to do these kinds of things would be wonderful, instead of 
having to add new system calls every time some issue with a flag that 
changes behaviour for an already existing system call comes up.

THAT is why I'd much rather have indirect system calls. 

The actual calling convention details are open for debate, of course. We 
could encode the information in the system call number itself, for example 
(eg have a bit there that says "extended information"). But we'll never 
get away from the fact that we have the odd architecture-specific system 
call interfaces with things like "pselect()" having pointers etc, if only 
because of legacy issues.

So we can *never* have a truly "generic" argument marshalling setup. We'll 
have to live with each architecture having system calls with special 
rules: some of those rules will be architecture-specific (eg number of 
easily available registers and/or historical reasons), and a few of the 
rules will be architecture-independent (eg things like sigreturn, clone 
and execve, that need to have direct access to the whole kernel return 
stack and simply *cannot* be called from any indirect code!)

So the choice is basically one of:

 - come up with a totally new interface to system calls, and effectively 
   duplicating the whole system call table.

   I'd hate to do this. We already have duplicated system call tables due 
   to compat stuff, it's painful.

 - just emulate the *existing* interface exactly, but with indirection. 
   IOW, the system call interface on x86 an unconditional "six words in 
   six registers, the meaning of which is totally up to the system call 
   implementation itself".

   This is what Uli's sys_indirect() does.

 - add whole new system calls with extended information, making the 6-word 
   limits even worse, and likely forcing a whole new argument marshalling 
   code with conditionals depending on per-system-call flags, which 
   further complicates it and slows things down.

Quite frankly, I can't really see many other approaches. And of the above 
three ones, the sys_indirect() approach really does seem to be the 
simplest *and* the best-performing. It's basically faster to just 
unconditionally load six registers off an indirect block than it would be 
to have any conditionals based on which system call it is.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/