linux-kernel - Re: [RFC][PATCH 0/3] fork: Add the ability to create tasks with given pids

Message-ID: <4ECBCE30.30001@parallels.com>
Date:	Tue, 22 Nov 2011 20:30:40 +0400
From:	Pavel Emelyanov <xemul@...allels.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alan Cox <alan@...ux.intel.com>,
	Roland McGrath <roland@...k.frob.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Cyrill Gorcunov <gorcunov@...nvz.org>,
	James Bottomley <jbottomley@...allels.com>
Subject: Re: [RFC][PATCH 0/3] fork: Add the ability to create tasks with given
 pids

On 11/22/2011 07:23 PM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Nov 22, 2011 at 03:11:02PM +0400, Pavel Emelyanov wrote:
>>> Hmmm... I hope this could be prettier.  I'm having trouble following
>>> where the MAY_OPEN comes from.  Can you please explain?
>>
>> From this calltrace:
>>
>>  pid_ns_ctl_permissions
>>  sysctl_perm
>>  proc_sys_permission
>>  inode_permission
>>  do_last <<<<< MAY_OPEN appears here
>>  path_openat
>>  do_filp_open
>>  do_sys_open
>>  sys_open
> 
> Thanks a lot. :)
> 
>>> Can't we for now allow this for root and then later allow CAP_CHECKPOINT 
>>> that Cyrill suggested?  Or do we want to allow setting pids even w/o CR 
>>> for NS creator?
>>
>> I think that systemd guys can play with it. E.g. respawning daemons with predefined
>> pids sounds like an interesting thing to play with.
> 
> But wouldn't CAP_CHECKPOINT be enough for systemd?

It would, but what's the point in granting systemd (which, by the way, can be a
container's init) the ability to use the _whole_ checkpoint/restore engine?

Even more: protecting this with a capability implies that any task might want to use
it. But why would an arbitrary task that merely _lives_ in a pid namespace need to set
the last_pid of its namespace?

>>>> +static int pid_ns_ctl_handler(struct ctl_table *table, int write,
>>>> +		     void __user *buffer, size_t *lenp, loff_t *ppos)
>>>> +{
>>>> +	struct ctl_table tmp = *table;
>>>> +	tmp.data = &current->nsproxy->pid_ns->last_pid;
>>>> +	return proc_dointvec(&tmp, write, buffer, lenp, ppos);
>>>> +}
>>>
>>> Probably better to call set_last_pid() on write path instead?
>>
>> Why? The usage of this sysctl is going to be synchronized  by external locks,
>> so why should we care?
> 
> I think the question should usually be the other way around.  Why
> deviate when the deviation doesn't earn any tangible benefit?  If you
> think setting it explicitly is justified, explain why in the comment
> of the setter and places where those explicit settings are.

set_last_pid() exists so that two concurrent updaters can both update last_pid safely.
Since setting last_pid via sysctl is racy by its nature, using that race protection on
the sysctl write path is simply pointless.
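
To make the difference concrete, here is a minimal userspace sketch of the two
update styles. It is a model under stated assumptions, not the kernel code: the
struct pid_ns_model type and the *_model/_plain function names are made up for
illustration, and C11 atomics stand in for the kernel's cmpxchg(). The
compare-and-swap loop only makes sense when every writer is an allocator moving
last_pid forward from a known base; a sysctl write has no such base, so the
plain store is all it can do.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the namespace's last_pid field. */
struct pid_ns_model {
	_Atomic int last_pid;
};

/* Starting a walk at 'base', is 'a' reached before 'b' (with wraparound)? */
static int pid_before(int base, int a, int b)
{
	return ((unsigned)a - (unsigned)base) < ((unsigned)b - (unsigned)base);
}

/*
 * Allocator-style update: publish 'pid' unless a concurrent allocator,
 * which also started from 'base', has already published a later pid.
 */
static void set_last_pid_model(struct pid_ns_model *ns, int base, int pid)
{
	int prev;
	int last_write = base;

	assert(pid > 0);
	do {
		int observed;

		prev = last_write;
		observed = prev;
		/* On failure 'observed' receives the value currently stored. */
		atomic_compare_exchange_strong(&ns->last_pid, &observed, pid);
		last_write = observed;
	} while (prev != last_write && pid_before(base, last_write, pid));
}

/* Sysctl-style update: overwrite unconditionally, no ordering check. */
static void store_last_pid_plain(struct pid_ns_model *ns, int pid)
{
	atomic_store(&ns->last_pid, pid);
}
```

In this model the allocator-style setter never moves last_pid backwards relative
to its base, while the plain store always wins. Whoever writes the sysctl has to
make sure nobody forks in between, which is exactly what the external lock is for.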

And yes, I agree that writing this comment is a good idea :)
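
For completeness, the externally locked usage discussed above could look
roughly like this from userspace. This is only a sketch: the path of the control
file is passed in as a parameter because this thread does not fix its name, the
helper names are hypothetical, and using flock() on the file itself as the
"external lock" is one possible convention between cooperating restorers, not
something the patch enforces.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Take an exclusive lock on the control file at 'path' and write 'pid'
 * to it. Returns the still-locked fd on success, so the caller can
 * fork() before anyone else changes the value, or -1 on error.
 */
static int lock_and_set_last_pid(const char *path, int pid)
{
	char buf[32];
	int fd, len;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_EX) < 0) {
		close(fd);
		return -1;
	}

	len = snprintf(buf, sizeof(buf), "%d", pid);
	if (len < 0 || write(fd, buf, (size_t)len) != (ssize_t)len) {
		close(fd);	/* closing the fd also drops the lock */
		return -1;
	}

	return fd;
}

/* Release the lock taken by lock_and_set_last_pid() and close the fd. */
static void unlock_last_pid(int fd)
{
	flock(fd, LOCK_UN);
	close(fd);
}
```

The intended sequence is: call lock_and_set_last_pid(), fork() the task that
should receive the next pid, check the pid the child actually got, then call
unlock_last_pid(). Tasks that fork without taking the lock can still race with
this, which is why the check after fork() stays necessary.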

> Thanks.
> 
