lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150423062438.GB28898@gmail.com>
Date:	Thu, 23 Apr 2015 08:24:38 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Josh Triplett <josh@...htriplett.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andy Lutomirski <luto@...capital.net>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Borislav Petkov <bp@...en8.de>,
	"H. Peter Anvin" <hpa@...or.com>, Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Will Drewry <wad@...omium.org>,
	Kees Cook <keescook@...omium.org>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 2/2] x86/asm/entry/32: Remove unnecessary optimization in
 stub32_clone


* Josh Triplett <josh@...htriplett.org> wrote:

> On Wed, Apr 22, 2015 at 11:22:02AM -0700, Linus Torvalds wrote:
> > On Wed, Apr 22, 2015 at 10:10 AM, Josh Triplett <josh@...htriplett.org> wrote:
> > >
> > > I do think my two-patch HAVE_COPY_THREAD_TLS series should go in fixing
> > > this
> > 
> > Ugh, I absolutely detesrt that patch.
> > 
> > Don't make random crazy function signatures that depend on some config
> > option. That's just evil. The patch is a mess of #ifdef's and should
> > be shot in the head and staked with a silver stake to  make sure it
> > never re-appears.
> > 
> > Either:
> > 
> >  (a) make the change for every architecture
> > 
> >  (b) have side-by-side interfaces. With different names!
> 
> ...that's exactly what I did.  They're called copy_thread and 
> copy_thread_tls; I very intentionally did not conditionally change 
> the signature of copy_thread, for exactly that reason.  Those 
> functions are implemented in architecture-specific code, so the 
> config option just specifies which of the two functions the 
> architecture provides.
> 
> *sys_clone* has different function signatures based on config 
> options, but I didn't touch that other than fixing the type of the 
> tls argument. That's historical baggage that we can't throw away 
> without breaking userspace.

So you want to add a new clone() parameter. I strongly suspect that 
you won't be the last person who wants to do this.

So why not leave the compatibility baggage alone and introduce a new 
clean clone syscall with a flexible, backwards and forwards compatible 
parameter ABI?

Something like:

  SYSCALL_DEFINE1(clone_params, struct clone_params __user *, params)

  struct clone_params {
	__u32 size;	/* User-space sets it to sizeof(struct clone_params) */

	__u32 tls_val;	/* Slightly out of order, for optimal structure packing */
	__u64 clone_flags;
	__u64 new_sp_addr;
	__u64 parent_tid_addr;
	__u64 child_tid_addr;
	__u64 tls;
  };

The only real cost of this approach is that this parameter structure 
has to be copied (once): should be fast as it fits into a single cache 
line and is a constant size copy. Also note how this parameter block 
can be passed down by const reference from that point on, instead of 
copying/shuffling 5-6 parameters into increasingly long function 
parameter lists. So it might in fact turn out to be slightly faster as 
well.

Note how easily extensible it is: a new field can be added by 
appending to the structure, and compatibility is achieved in the 
kernel without explicit versioning, by checking params->size:

  params->size == sizeof(*params):

     Good, kernel and userspace ABI version matches. This is a simple
     check and the most typical situation - it will be the fastest as
     well.


  params->size < sizeof(*params):

     Old binary calls into new kernel. Missing fields are set to 0.
     Binaries will be forward compatible without any versioning hacks.


  params->size > sizeof(*params):

     New user-space calls into old kernel. System call can still be
     serviced if the new field(s) are all zero. We return -ENOSYS if 
     any field is non-zero. (i.e. if new user-space tries to make use
     of a new syscall feature that this kernel has not implemented
     yet.) This way user-space can figure out whether a particular
     new parameter is supported by a kernel, without having to add new
     system calls or versioning.

Also note how 'fool proof' this kind of 'versioning' is: the parameter 
block is extended by appending to the end, cleanly extending the 
kernel internals to handle the new parameter - end of story. The 
zeroing logic on size mismatch will take care of the rest 
automatically, it's all ABI forwards and backwards compatible to the 
maximum possible extent.

It's relatively easy to use this same interface from 32-bit compat 
environments as well: they can use the very same parameter block, the 
kernel's compat layer possibly just needs to do a minor amount of 
pointer translation for the _addr fields, for example to truncate them 
down to 32 bits for security, by filling in the high words of pointers 
with 0. (This too can be optimized further if needed, by open coding 
the copy and zeroing.) This too is forwards and backwards ABI 
compatible to the maximum possible extent.

So introduce a single new syscall with the right ABI architecture and 
you can leave the old mistakes alone.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ