Message-ID: <4AE20124.4010108@librato.com>
Date:	Fri, 23 Oct 2009 15:16:52 -0400
From:	Oren Laadan <orenl@...rato.com>
To:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
CC:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Matt Helsley <matthltc@...ibm.com>,
	Daniel Lezcano <daniel.lezcano@...e.fr>,
	randy.dunlap@...cle.com, arnd@...db.de, linux-api@...r.kernel.org,
	Containers <containers@...ts.linux-foundation.org>,
	Nathan Lynch <nathanl@...tin.ibm.com>,
	linux-kernel@...r.kernel.org, Louis.Rilling@...labs.com,
	kosaki.motohiro@...fujitsu.com, hpa@...or.com, mingo@...e.hu,
	torvalds@...ux-foundation.org,
	Alexey Dobriyan <adobriyan@...il.com>, roland@...hat.com,
	Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [RFC][v8][PATCH 0/10] Implement clone3() system call



Sukadev Bhattiprolu wrote:
> Eric W. Biederman [ebiederm@...ssion.com] wrote:
> | > | +	if (target < RESERVED_PIDS)
> | >
> | > Should we replace RESERVED_PIDS with 0 ? We currently allow new
> | > containers to have pids 1..32K in the first pass and in subsequent
> | > passes assign starting at RESERVED_PIDS.
> | 
> | If it is a preexisting pid namespace, removing the RESERVED_PIDS
> | check removes most if not all of the point of RESERVED_PIDS.
> | 
> | In a new fresh pid namespace I have no problem with not performing
> | the RESERVED_PIDS check.
> 
> In that case can we do this
> 
> 	if (target_pid < RESERVED_PIDS && !pid_ns->level)
> 		return -EINVAL;
> 
> instead ?
> | 
> | So I guess that makes the check.
> | 
> | if ((target < RESERVED_PIDS) && pid_ns->last_pid >= RESERVED_PIDS)
> |    return -EINVAL;
> 
> I am just wondering if there is a small corner case where C/R would randomly
> fail because of this sequence:
> 
> 	- C/R code calls clone() or clone3() say about RESERVED_PIDS-1
> 	  times and ->last_pid == RESERVED_PIDS-1.
> 
> 	- C/R code calls normal fork()/alloc_pidmap() for a short-lived
> 	  child - its pid == ->last_pid == RESERVED_PIDS
> 
> 	- C/R code then calls clone3()/set_pidmap() to set the pid of
> 	  a new child to RESERVED_PIDS but fails (i.e. it fails to restore
> 	  a pid even when the pid is not in use).

This is not limited to short-lived children. The problem is that restart
will succeed or fail depending on the order in which tasks were
checkpointed. If the task with pid 290 is restarted after the task with
pid 305, restart will fail.

And because checkpoint scans the task tree in DFS order, this is
more likely to happen than not.
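
To make the ordering problem concrete, here is a small user-space model
of the check proposed above. It is only a sketch, not kernel code:
struct pid_ns_model, set_pid_allowed() and restore_in_order() are
hypothetical names, RESERVED_PIDS is assumed to be 300, and the model
assumes that setting an explicit pid also advances last_pid.

```c
#include <stdbool.h>
#include <stddef.h>

#define RESERVED_PIDS 300	/* assumed value of the kernel constant */

/* Hypothetical stand-in for the namespace state the check consults. */
struct pid_ns_model {
	int last_pid;
};

/*
 * Models the proposed check: a target below RESERVED_PIDS is refused
 * once last_pid has reached RESERVED_PIDS.
 */
static bool set_pid_allowed(const struct pid_ns_model *ns, int target)
{
	if (target < RESERVED_PIDS && ns->last_pid >= RESERVED_PIDS)
		return false;
	return true;
}

/*
 * Restores the given pids in order, starting from a fresh namespace.
 * Returns the index of the first pid that is refused, or -1 if all
 * pids are accepted.
 */
static int restore_in_order(const int *pids, size_t n)
{
	struct pid_ns_model ns = { .last_pid = 0 };

	for (size_t i = 0; i < n; i++) {
		if (!set_pid_allowed(&ns, pids[i]))
			return (int)i;
		/* Assumption: the explicit-pid path updates last_pid. */
		ns.last_pid = pids[i];
	}
	return -1;
}
```

In this model, restoring { 290, 305 } returns -1 (success), while
restoring the same two pids as { 305, 290 } returns 1: pid 290 is
refused even though it is free.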

I wonder why you'd like to restrict a pid-specific clone like that?
It is already a privileged syscall, so it could be exempt. I suggest
that only regular clones be constrained.
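
A rough user-space sketch of that split, under stated assumptions: the
regular path keeps honoring RESERVED_PIDS when it wraps around, and the
explicit-pid path accepts any free pid in range. PID_MAX, struct
pid_ns_model, alloc_pid_regular() and alloc_pid_explicit() are
hypothetical names for illustration, RESERVED_PIDS is assumed to be 300,
and the explicit path is assumed to leave last_pid untouched.

```c
#include <errno.h>
#include <stdbool.h>

#define RESERVED_PIDS 300	/* assumed value of the kernel constant */
#define PID_MAX 1024		/* small hypothetical limit for the model */

/* Hypothetical stand-in for a pid namespace: a usage map plus last_pid. */
struct pid_ns_model {
	bool used[PID_MAX];
	int last_pid;
};

/*
 * Regular path: allocate sequentially after last_pid. On wraparound,
 * restart at RESERVED_PIDS so low pids are not reused.
 * Returns the new pid, or -EAGAIN if no pid is free.
 */
static int alloc_pid_regular(struct pid_ns_model *ns)
{
	int pid = ns->last_pid + 1;

	for (int tries = 0; tries < PID_MAX; tries++, pid++) {
		if (pid >= PID_MAX)
			pid = RESERVED_PIDS;
		if (!ns->used[pid]) {
			ns->used[pid] = true;
			ns->last_pid = pid;
			return pid;
		}
	}
	return -EAGAIN;
}

/*
 * Explicit path (privileged caller): no RESERVED_PIDS check, only
 * range and availability. Returns the pid, -EINVAL or -EBUSY.
 */
static int alloc_pid_explicit(struct pid_ns_model *ns, int target)
{
	if (target <= 0 || target >= PID_MAX)
		return -EINVAL;
	if (ns->used[target])
		return -EBUSY;
	ns->used[target] = true;
	return target;
}
```

With this split, restoring pid 305 and then pid 290 both succeed, and a
regular allocation that wraps around still starts at RESERVED_PIDS.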

Oren.


