linux-kernel - Re: [RFC PATCH linux-next] ns: do not allocate a new nsproxy at each call

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87y55u1ip2.fsf@xmission.com>
Date:	Tue, 15 Oct 2013 11:40:25 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"Guillaume Gaudonville" <gaudonville@...nd.com>
Cc:	linux-kernel@...r.kernel.org, serge.hallyn@...onical.com,
	akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
	davem@...emloft.net, cmetcalf@...era.com,
	Guillaume Gaudonville <guillaume.gaudonville@...nd.com>
Subject: Re: [RFC PATCH linux-next] ns: do not allocate a new nsproxy at each call

"Guillaume Gaudonville" <gaudonville@...nd.com> writes:

> Currently, at each call of setns system call a new nsproxy is allocated,
> the old nsproxy namespaces are copied into the new one and the old nsproxy
> is freed if the task was the only one to use it.
>
> It can creates large delays on hardware with large number of cpus since
> to free a nsproxy a synchronize_rcu() call is done.
>
> When a task is the only one to use a nsproxy, only the task can do an action
> that will make this nsproxy to be shared by another task or thread (fork,...).
> So when the refcount of the nsproxy is equal to 1, we can simply update the
> current nsproxy field without allocating a new one and freeing the old one.
>
> The install operations of each kind of namespace cannot fails, so there's no
> need to check for an error and calling ops->install().
>
> Tested on TileGX (36 cores) and Intel (32 cores).

This may be worth doing (I am a little scared of a design that has setns
on a fast path) but right now this isn't safe.

Currently pidns_install ends with:

	put_pid_ns(nsproxy->pid_ns_for_children);
	nsproxy->pid_ns_for_children = get_pid_ns(new);
	return 0;


And netns_install ends with:

	put_net(nsproxy->net_ns);
	nsproxy->net_ns = get_net(net);
	return 0;

The put before the set is not atomic and is not safe unless
the nsproxy is private.  I think this is fixable but it requires a more
indepth look at the code than you have done.

Mind if I ask where this comes up?


> Reported-by: Chris Metcalf <cmetcalf@...era.com>
> Signed-off-by: Guillaume Gaudonville <guillaume.gaudonville@...nd.com>
> ---
>  kernel/nsproxy.c |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index afc0456..afc04ac 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -255,6 +255,18 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype)
>  	if (nstype && (ops->type != nstype))
>  		goto out;
>  
> +	/*
> +	 * If count == 1, only the current task can increment it,
> +	 * by doing a fork for example so we can safely update the
> +	 * current nsproxy pointers without allocate a new one,
> +	 * update it and destroy the old one
> +	 */
> +	if (atomic_read(&tsk->nsproxy->count) == 1) {
> +		err = ops->install(tsk->nsproxy, ei->ns);
> +		fput(file);
> +		return err;
> +	}

As a minor nit, but to match the rest of the code in this function that
should read:

> +	if (atomic_read(&tsk->nsproxy->count) == 1) {
> +		err = ops->install(tsk->nsproxy, ei->ns);
> +		goto out;
> +	}

There is no need to add an additional exit point to reason about.

> +
>  	new_nsproxy = create_new_namespaces(0, tsk, current_user_ns(), tsk->fs);
>  	if (IS_ERR(new_nsproxy)) {
>  		err = PTR_ERR(new_nsproxy);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/