Message-ID: <20090803175452.GA5481@us.ibm.com>
Date:	Mon, 3 Aug 2009 12:54:52 -0500
From:	"Serge E. Hallyn" <serue@...ibm.com>
To:	Ben Blum <bblum@...gle.com>
Cc:	linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, akpm@...ux-foundation.org,
	lizf@...fujitsu.com, menage@...gle.com
Subject: Re: [PATCH 6/6] Makes procs file writable to move all threads by
	tgid at once

Quoting Ben Blum (bblum@...gle.com):
...
> +static int cgroup_task_migrate(struct cgroup *cgrp, struct cgroup *oldcgrp,
> +			       struct task_struct *tsk, int guarantee)
> +{
> +	struct css_set *oldcg;
> +	struct css_set *newcg;
> +
> +	/*
> +	 * get old css_set. we need to take task_lock and refcount it, because
> +	 * an exiting task can change its css_set to init_css_set and drop its
> +	 * old one without taking cgroup_mutex.
> +	 */
> +	task_lock(tsk);
> +	oldcg = tsk->cgroups;
> +	get_css_set(oldcg);
> +	task_unlock(tsk);
> +	/*
> +	 * locate or allocate a new css_set for this task. 'guarantee' tells
> +	 * us whether or not we are sure that a new css_set already exists;
> +	 * in that case, we are not allowed to fail, as we won't need malloc.
> +	 */
> +	if (guarantee) {
> +		/*
> +		 * our caller promises us that the css_set we want already
> +		 * exists, so we use find_existing_css_set directly.
> +		 */
> +		struct cgroup_subsys_state *template[CGROUP_SUBSYS_COUNT];
> +		read_lock(&css_set_lock);
> +		newcg = find_existing_css_set(oldcg, cgrp, template);
> +		BUG_ON(!newcg);
> +		get_css_set(newcg);
> +		read_unlock(&css_set_lock);
> +	} else {
> +		might_sleep();

So cgroup_task_migrate() might sleep (on the guarantee==0 path), but

...


> +	down_write(&leader->cgroup_fork_mutex);
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(tsk, &leader->thread_group, thread_group) {
> +		/* leave current thread as it is if it's already there */
> +		oldcgrp = task_cgroup(tsk, subsys_id);
> +		if (cgrp == oldcgrp)
> +			continue;
> +		/* we don't care whether these threads are exiting */
> +		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, 1);

Here it is called under rcu_read_lock().
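
For the record, that combination is exactly what the atomic-sleep
debug checks exist to catch.  A minimal illustration (mine, not from
the patch; kmalloc() stands in for any blocking call, and with
classic RCU rcu_read_lock() disables preemption):

#include <linux/rcupdate.h>
#include <linux/slab.h>

static void broken(void)
{
	void *p;

	rcu_read_lock();
	p = kmalloc(64, GFP_KERNEL);	/* may sleep: invalid here */
	rcu_read_unlock();
	kfree(p);
}

With the debug checks on, that path prints "BUG: sleeping function
called from invalid context".  The guarantee==1 path sidesteps the
allocation, so whether this can actually trigger depends on every
caller passing 'guarantee' correctly.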

...

> -void cgroup_fork(struct task_struct *child)
> +void cgroup_fork(struct task_struct *child, int clone_flags)
>  {
> +	if (clone_flags & CLONE_THREAD)
> +		down_read(&current->group_leader->cgroup_fork_mutex);
> +	else
> +		init_rwsem(&child->cgroup_fork_mutex);

I'm also worried about the overhead here on what should be a fast
path, CLONE_THREAD.  Have you done any benchmarking of one thread
spawning a bunch of others?
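
Something as small as this (hypothetical userspace harness: spawn and
join N no-op threads, run with and without the patch) would do for a
first measurement:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void *noop(void *arg)
{
	return NULL;
}

int main(int argc, char **argv)
{
	int i, n = argc > 1 ? atoi(argv[1]) : 10000;
	pthread_t *tids = malloc(n * sizeof(*tids));
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		if (pthread_create(&tids[i], NULL, noop, NULL))
			return 1;
	for (i = 0; i < n; i++)
		pthread_join(tids[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%d threads: %.1f ms\n", n,
	       (t1.tv_sec - t0.tv_sec) * 1e3 +
	       (t1.tv_nsec - t0.tv_nsec) / 1e6);
	free(tids);
	return 0;
}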

What *exactly* is it we are protecting with cgroup_fork_mutex?
'fork' (as the name implies) is not a good answer, since we should be
protecting data, not code.  If it is solely tsk->cgroups, then perhaps
we should in fact try switching to (s?)rcu.  Then cgroup_fork() could
just do rcu_read_lock(), while cgroup_task_migrate() would make the
change under a spinlock (to protect against concurrent
cgroup_task_migrate()s), using rcu_assign_pointer() to let
cgroup_fork() see consistent data either before or after the
update...  That might mean that any checks involving the # of tasks
done while a migration is in flight could be invalidated by the time
it completes?  Seems acceptable (since it'll be a small overcharge at
most and can be quickly remedied).
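
Untested sketch of that shape (css_set_update_lock is a name I just
made up, and the refcounting is pared down to get/put):

#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(css_set_update_lock);

/* reader side -- what cgroup_fork() would do: */
static struct css_set *css_set_snapshot(struct task_struct *tsk)
{
	struct css_set *cg;

	rcu_read_lock();
	cg = rcu_dereference(tsk->cgroups);
	get_css_set(cg);	/* can't be freed inside the read section */
	rcu_read_unlock();
	return cg;
}

/* update side -- what cgroup_task_migrate() would do: */
static void css_set_switch(struct task_struct *tsk, struct css_set *newcg)
{
	struct css_set *oldcg;

	spin_lock(&css_set_update_lock);	/* serialize migrations */
	oldcg = tsk->cgroups;
	rcu_assign_pointer(tsk->cgroups, newcg);
	spin_unlock(&css_set_update_lock);

	synchronize_rcu();	/* all readers now see newcg */
	put_css_set(oldcg);
}

(A synchronize_rcu() per thread would hurt when moving a big thread
group; batching the put_css_set()s behind one grace period would fix
that, but that's a detail.)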

-serge
