Date:	Wed, 17 Oct 2012 16:16:32 +1000
From:	Greg Ungerer <gerg@...pgear.com>
To:	Al Viro <viro@...IV.linux.org.uk>
CC:	<linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	<linux-arch@...r.kernel.org>, David Miller <davem@...emloft.net>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Subject: Re: [RFC][CFT][CFReview] execve and kernel_thread unification work

Hi Al,

On 15/10/12 11:30, Al Viro wrote:
> On Mon, Oct 01, 2012 at 10:38:09PM +0100, Al Viro wrote:
>> [with apologies for folks Cc'd, resent due to mis-autoexpanded l-k address
>> on the original posting ;-/  Mea culpa...]
>>
>> 	There's an interesting ongoing project around kernel_thread() and
>> friends, including execve() variants.  I really need help from architecture
>> maintainers on that one; I'd been able to handle (and test) quite a few
>> architectures on my own [alpha, arm, m68k, powerpc, s390, sparc, x86, um]
>> plus two more untested [frv, mn10300].  c6x patches had been supplied by
>> Mark Salter; everything else remains to be done.  Right now it's at
>> minus 1.2KLoC, quite a bit of that removed from asm glue and other black magic.
>
> Update:
> 	* all infrastructure is in mainline now, along with conversion for
> kernel_thread() callbacks to the form that allows a really simple model for
> kernel_execve() _without_ flagday changes.
> 	* #experimental-kernel_thread is gone; this stuff is in for-next
> now.
> 	* a lot of architecture conversions had been done and some are
> even tested.  Currently missing are only 7 - avr32, hexagon, m32r, openrisc,
> score, tile and xtensa.  OTOH, a lot are completely untested.  I've put
> per-architecture stuff into separate branches and I promise never to rebase
> those once arch maintainers are OK with the stuff in them.  IOW, they'll
> be safe to pull into respective architecture trees.
>
> Folks, *please* review the stuff in signal.git#arch-*.  All of them are
> completely independent.  I'll be glad to get ACKs/fixes/replacements/etc.

I have checked arch-m68k on ColdFire with and without MMU, and it
is all fine. So for those:

Acked-by: Greg Ungerer <gerg@...inux.org>

Regards
Greg



> I've merged some of those into for-next, but that can change at any time -
> it's not final; for-next will be rebased.  Obviously, I hope to get to
> the situation when all of those branches (plus currently missing ones)
> get into shape that satisfies architecture maintainers.  Once that happens,
> all those branches will be merged into for-next.
>
> I think the model is about final wrt kernel_thread()/kernel_execve()/
> sys_execve().  There's one possible change on top of it, but it's reasonably
> well-isolated from the rest.  As it is, the model to aim for is this:
> 	* select GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE
> 	* kill local kernel_thread()/kernel_execve() implementations
> 	* generic kernel_thread() will call your copy_thread() with
> NULL regs and fn/arg passed in the pair of arguments that are blindly
> passed all the way through to copy_thread() - usp and stack_size resp.
> In such case copy_thread() should arrange for the newborn to be woken
> up in a function that is very similar to ret_from_fork().  The only
> difference is that between the call of schedule_tail() and jumping into
> the "return from syscall" code it should call fn(arg), using the data
> left for it by copy_thread().
> 	* unlike the previous variant, ret_from_kernel_execve() is not
> needed at all; no need to play longjmp()-like games when kernel_thread()
> callbacks had been taught to return normally all the way out when
> kernel_execve() returns 0; any updates of sp/manipulations of register
> windows/etc. will happen without any magic.
> 	* provide current_pt_regs() if needed.  Default is
> task_pt_regs(current), but you might want to optimize it and unlike
> task_pt_regs() it must work whenever we are in syscall or in a kernel thread.
> task_pt_regs(task), OTOH, is required to work only when task can be
> interrogated by tracer.
> 	* no more syscalls-from-kernel, which often allows for simplifications
> in the syscall entry/exit logic.  I haven't done any of those; up to the
> architecture maintainers.
>
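The copy_thread() convention in the list above can be sketched in user-space C. This is a minimal model under stated assumptions, not kernel code: struct pt_regs_sketch, struct task_sketch, copy_thread_sketch(), first_wakeup_sketch() and demo_fn() are hypothetical names invented for illustration, and the sketch assumes unsigned long is wide enough to hold a pointer, as it is on Linux ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for an architecture's pt_regs. */
struct pt_regs_sketch {
	unsigned long sp;	/* user stack pointer */
	unsigned long retval;	/* syscall return value register */
};

/* Which path the newborn takes the first time it is scheduled. */
enum wake_path { WAKE_RET_FROM_FORK, WAKE_RET_FROM_KERNEL_THREAD };

/* Hypothetical stand-in for the per-task state copy_thread() sets up. */
struct task_sketch {
	enum wake_path wake;
	int (*fn)(void *);	/* kernel thread payload, if any */
	void *arg;
	struct pt_regs_sketch regs;
};

/*
 * Model of the convention: with regs == NULL the caller is the generic
 * kernel_thread(), and usp/stack_size are not stack parameters at all
 * but carry fn and arg, passed through blindly.
 */
static int copy_thread_sketch(struct task_sketch *p, unsigned long usp,
			      unsigned long stack_size,
			      const struct pt_regs_sketch *regs)
{
	if (regs == NULL) {
		/* Integer-to-pointer casts: assumes unsigned long fits a pointer. */
		p->fn = (int (*)(void *))(uintptr_t)usp;
		p->arg = (void *)(uintptr_t)stack_size;
		p->wake = WAKE_RET_FROM_KERNEL_THREAD;
		return 0;
	}
	/* fork()/clone()/vfork(): the child gets a copy of the parent's regs. */
	p->regs = *regs;
	p->regs.retval = 0;		/* the child sees 0 as the return value */
	if (usp)
		p->regs.sp = usp;	/* caller supplied a new user stack */
	p->fn = NULL;
	p->arg = NULL;
	p->wake = WAKE_RET_FROM_FORK;
	return 0;
}

/*
 * Model of the first wakeup. In a real port this is assembly glue: call
 * schedule_tail(), then (kernel thread case only) call fn(arg), then jump
 * into the "return from syscall" code. Here we only return fn's result.
 */
static int first_wakeup_sketch(struct task_sketch *p)
{
	/* schedule_tail() would be called here on both paths. */
	if (p->wake == WAKE_RET_FROM_KERNEL_THREAD) {
		assert(p->fn != NULL);
		return p->fn(p->arg);
	}
	return 0;
}

/* Example payload for a kernel thread in this model. */
static int demo_fn(void *arg)
{
	return *(int *)arg + 1;
}
```

In this model a caller standing in for kernel_thread() would pass `(unsigned long)(uintptr_t)demo_fn` as usp and the address of its argument as stack_size, with regs set to NULL. The point of the design is that the fork-style path and the kernel-thread path differ only in the single fn(arg) call between schedule_tail() and the return-from-syscall code.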
> 	One thing to keep in mind is that right now on SMP architectures
> there's a third caller of copy_thread(), besides fork()/clone()/vfork()
> (all pass userland pt_regs, with the address being current_pt_regs()) and
> kernel_thread() (pass NULL pt_regs, kthread creation time).  It's fork_idle()
> and it passes zero-filled pt_regs.  Frankly, I'm not even sure we want to
> call copy_thread() in that case - the stuff set up by it goes nowhere.
> We do that for each possible secondary CPU on SMP and we do *not* expose
> those threads to scheduler.  When CPU gets initialized we have the
> secondary bootstrap take that task_struct as current.  Its kernel stack,
> thread_info, etc. are set up by said secondary bootstrap, overriding whatever
> copy_thread() has done.  Eventually the bootstrap reaches cpu_idle(),
> which is where we schedule away.  switch_to() done by schedule() is what
> completes setting the things up; at that point they are ready to be woken
> up - and not in ret_from_fork(), of course.
> 	For the majority of architectures nothing done by copy_thread() in
> that case is used afterwards, so we might as well stop calling it when
> copy_process() is called by fork_idle().  I know of only one dubious case -
> powerpc sets thread->ksp_limit on copy_thread() and I'm not sure if
> that gets overwritten in secondary bootstrap - the value would still be
> correct and I don't see any obvious places where it would be reassigned
> on that codepath.  There might be other cases like that, though.  I would
> argue that for this kind of stuff the right place is arch_dup_task_struct(),
> not copy_thread()...  Hell knows.  Note that we are pretty much hitting
> the random path in copy_thread() in that case - what zeroed pt_regs look
> like to user_regs() is arch-dependent.
>
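The remark that zero-filled pt_regs look different from one architecture to the next can be illustrated with a small user-space sketch. Everything here is hypothetical: regs_a, regs_b and both predicates are invented encodings in the general style of a "did we come from user mode" check, and they are not taken from any particular architecture's headers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical arch A: bit 0 of a status word set means supervisor mode. */
struct regs_a {
	unsigned long status;
};

/* Hypothetical arch B: low two bits of a segment word equal to 3 mean user. */
struct regs_b {
	unsigned long seg;
};

static bool user_mode_a(const struct regs_a *regs)
{
	return (regs->status & 1UL) == 0;
}

static bool user_mode_b(const struct regs_b *regs)
{
	return (regs->seg & 3UL) == 3UL;
}

/* What a fork_idle()-style zero-filled register set looks like on arch A. */
static bool zeroed_looks_like_user_a(void)
{
	struct regs_a regs;

	memset(&regs, 0, sizeof(regs));
	return user_mode_a(&regs);
}

/* The same zero-filled register set under arch B's encoding. */
static bool zeroed_looks_like_user_b(void)
{
	struct regs_b regs;

	memset(&regs, 0, sizeof(regs));
	return user_mode_b(&regs);
}
```

Under encoding A the zeroed registers are classified as user mode, under encoding B as kernel mode, so a copy_thread() that branches on such a check takes a different path for the idle task depending on the architecture.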
> 	This is the possible change I've mentioned above.  Not sure; I'd
> really like comments on that one.
>
> 	Branches in there:
> arch-blackfin - conversion; completely untested
> arch-cris - conversion; completely untested
> arch-h8300 - conversion; completely untested
> arch-microblaze - conversion; completely untested
> arch-sh - conversion; completely untested
> arch-unicore32 - conversion; completely untested
> arch-ia64 - conversion; tested only on ski, which is worth very little
> arch-c6x - followup to mainline; while it's minor, it's pretty much done
> blindly and *really* needs review by maintainer.
> arch-arm - contains heroic fix by rmk and nothing else.  Seems to work fine.
> arch-m68k - minor followup to stuff already in mainline; works on aranym
> arch-parisc - mostly the stuff tested by parisc folks + minor followup
> similar to m68k one.
> arch-s390 - minor followup to mainline; works in hercules
> arch-arm64 - patches from maintainer with minor followup folded
> arch-frv - minor followup to mainline, needs testing
> arch-mn10300 - minor followup to mainline, needs testing
> arch-mips - patches from me and Ralf; works on qemu
> arch-sparc - conversions for sparc32 and sparc64, plus the syscall_noerror
> optimization
> arch-powerpc - minor followups to mainline, need review by maintainers
>
> "Completely untested" in the above reads "no promises it even compiles, let
> alone isn't horribly broken".  Please, treat that as a possible starting
> point for doing the conversion for arch in question.  I might have misread
> the CPU manuals, your switch_to() implementation, etc., or just have been
> temporarily insane from digging through dozens of architectures.  Hopefully
> temporary, that is...
>
> And folks, for pity's sake, do the remaining seven.  The merge window is
> over, so...
>
> 			Al, buggering off to get some VFS work done.


-- 
------------------------------------------------------------------------
Greg Ungerer  --  Principal Engineer        EMAIL:     gerg@...pgear.com
SnapGear Group, McAfee                      PHONE:       +61 7 3435 2888
8 Gardner Close                             FAX:         +61 7 3217 5323
Milton, QLD, 4064, Australia                WEB: http://www.SnapGear.com
