linux-kernel - Re: [RFC][v8][PATCH 0/10] Implement clone3() system call


Message-ID: <m1r5sxsw7w.fsf@fess.ebiederm.org>
Date:	Tue, 20 Oct 2009 12:26:27 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
Cc:	Matt Helsley <matthltc@...ibm.com>,
	Oren Laadan <orenl@...rato.com>,
	Daniel Lezcano <daniel.lezcano@...e.fr>,
	randy.dunlap@...cle.com, arnd@...db.de, linux-api@...r.kernel.org,
	Containers <containers@...ts.linux-foundation.org>,
	Nathan Lynch <nathanl@...tin.ibm.com>,
	linux-kernel@...r.kernel.org, Louis.Rilling@...labs.com,
	kosaki.motohiro@...fujitsu.com, hpa@...or.com, mingo@...e.hu,
	torvalds@...ux-foundation.org,
	Alexey Dobriyan <adobriyan@...il.com>, roland@...hat.com,
	Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [RFC][v8][PATCH 0/10] Implement clone3() system call

Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com> writes:

> Eric W. Biederman [ebiederm@...ssion.com] wrote:
> | > Could you clarify ? How is the call to alloc_pidmap() from clone3() different
> | > from the call from clone() itself ?
> | 
> | I think it is totally inappropriate to assign pids in a pid namespace
> | where there are user space processes already running.
>
> Honestly, I don't understand why it is inappropriate or how this differs
> from normal clone() - which also assigns pids in own and ancestor pid
> namespaces.

The fact we can specify which pids we want.  I won't claim it is as
exploitable as NULL pointer dereferences have been but it has that kind
of feel to it.

> | > | How we handle a clone extension depends critically on if we want to
> | > | create processes for restart in user space or kernel space.
> | > | 
> | > | Could some one give me or point me at a strong case for creating the
> | > | processes for restart in user space?
> | >
> | > There has been a lot of discussion on this with reference to the
> | > Checkpoint/Restart patchset. See http://lkml.org/lkml/2009/4/13/401
> | > for instance.
> | 
> | Just read it.  Thank you.
>
> Sorry. I should have mentioned the reason here. (Like you mention below),
> flexibility is the main reason.
>
> | Now I am certain clone_with_pids() is not useful functionality to be
> | exporting to userspace.
> | 
> | The only real argument in favor of doing this in user space is greater
> | flexibility.  I can see checkpointing/restoring a single thread process
> | without a pid namespace.  Anything more and you are just asking for
> | trouble.
> | 
> | A design that weakens security.  Increases maintenance costs.  All for
> | an unreliable result seems like a bad one to me.
> | 
> | > | The pid assignment code is currently ugly.  I asked that we just pass
> | > | in the min/max pid values that already exist into the core pid
> | > | assignment function and a constrained min/max that only admits a
> | > | single pid when we are allocating a struct pid for restart.  That was
> | > | not done and now we have a weird abortion with unnecessary special cases.
> | >
> | > I did post a version of the patch attempting to implement that. As
> | > pointed out in:
> | >
> | > 	http://lkml.org/lkml/2009/8/17/445
> | >
> | > we would need more checks in alloc_pidmap() to cover cases like min or max
> | > being invalid or min being greater than max or max being greater than pid_max
> | > etc. Those checks also made the code ugly (imo).
> | 
> | If you need more checks you are doing it wrong.  The code already has min
> | and max values, and even a start value.  I was just strongly suggesting
> | we generalize where we get the values from, and then we have no special
> | cases.
>
> Well, if alloc_pidmap(pid_ns, min, max) does not have to check the
> parameters passed in (ie assumes that callers pass it in correctly)
> it might be simple. But when user specifies the pid, the 
>
> 	min == max == user's target pid
>
> so we will need to check the values either here or in callers.

Agreed, when you are talking about the target pid.  That code path
needs the extra check.

> Yes the code already has values and a start value. But these are
> controlled by alloc_pidmap() and not passed in from the user space.

I was only thinking of values passed in from someplace else in kernel/pid.c.

> alloc_pidmap() needs to assign the next available pid or a specific
> target pid.  Generalizing it to alloc a pid in a range seemed to be a
> bit of overkill for currently known usages.

alloc_pidmap(), in assigning the next available pid, is already
allocating a pid in a range.
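
To make the range argument concrete, here is a minimal user-space
sketch in C.  It is not the kernel's alloc_pidmap() and not any posted
patch: PID_MAX, RESERVED_PIDS, the pid_used[] array and all three
function names are hypothetical, and a plain bool array stands in for
the real pid bitmap.  It only shows that "next available pid" and
"exactly this pid" can share one range allocator, with the extra
validation living in the target-pid path alone.

```c
#include <stdbool.h>

#define PID_MAX       64  /* hypothetical small limit for this sketch */
#define RESERVED_PIDS 2   /* hypothetical: pids below this are never handed out */

static bool pid_used[PID_MAX];           /* toy stand-in for the pid bitmap */
static int last_pid = RESERVED_PIDS - 1; /* last pid handed out by the normal path */

/*
 * Core allocator: find the first free pid in [min, max], scanning from
 * start and wrapping around once.  It trusts its callers to pass a sane
 * range with min <= start <= max.  Returns -1 if the range is full.
 */
static int alloc_pid_in_range(int min, int max, int start)
{
    int span = max - min + 1;

    for (int i = 0; i < span; i++) {
        int pid = min + (start - min + i) % span;

        if (!pid_used[pid]) {
            pid_used[pid] = true;
            return pid;
        }
    }
    return -1;
}

/* Normal path: the range is the whole usable pid space. */
static int alloc_next_pid(void)
{
    int start = last_pid + 1;
    int pid;

    if (start >= PID_MAX)
        start = RESERVED_PIDS;

    pid = alloc_pid_in_range(RESERVED_PIDS, PID_MAX - 1, start);
    if (pid >= 0)
        last_pid = pid;
    return pid;
}

/*
 * Restart path: min == max == target, so the range admits a single pid.
 * The extra check on the caller-supplied value is done here, not in
 * the core allocator.
 */
static int alloc_target_pid(int target)
{
    if (target < RESERVED_PIDS || target >= PID_MAX)
        return -1;

    return alloc_pid_in_range(target, target, target);
}
```

In this sketch the core function has no special case for restart; the
only difference between the two callers is where min, max and start
come from.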

> I will post a version of the patch outside this patchset with min
> and max parameters and we can see if it can be optimized/beautified.

Thanks,
Eric
