linux-kernel - Re: [PATCH 2/3] sched, cgroup: replace signal_struct->group_rwsem with a global percpu


Message-ID: <20150519151659.GF3644@twins.programming.kicks-ass.net>
Date:	Tue, 19 May 2015 17:16:59 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Tejun Heo <tj@...nel.org>
Cc:	lizefan@...wei.com, cgroups@...r.kernel.org, mingo@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] sched, cgroup: replace signal_struct->group_rwsem
 with a global percpu_rwsem

On Wed, May 13, 2015 at 04:35:17PM -0400, Tejun Heo wrote:

.gitconfig:

[diff "default"]
        xfuncname = "^[[:alpha:]$_].*[^:]$"

That stops the hunk headers from keying on labels such as
retry_find_task: and shows us that the hunk below is in
__cgroup_procs_write().
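
To illustrate why that pattern skips labels, here is a rough sketch of
the matching logic. It is not how git itself is implemented: Python's re
module has no [[:alpha:]] class, so an ASCII-only [A-Za-z] stands in for
it (an assumption), the backwards walk is a simplification of how diff
picks a hunk header, and the sample lines are hypothetical stand-ins for
the real source.

```python
import re

# ASCII-only translation of the POSIX ERE "^[[:alpha:]$_].*[^:]$".
# Assumption: [A-Za-z] replaces [[:alpha:]], which Python's re lacks.
XFUNCNAME = re.compile(r"^[A-Za-z$_].*[^:]$")


def is_hunk_header_candidate(line: str) -> bool:
    """Return True if the line starts at column 0 with a letter, '$' or '_'
    and does not end in ':' (so goto labels are rejected)."""
    return XFUNCNAME.search(line) is not None


def nearest_funcname(lines, index):
    """Walk backwards from lines[index] to the closest matching line.

    Simplified model of hunk-header selection; returns None if nothing matches.
    """
    for line in reversed(lines[: index + 1]):
        if is_hunk_header_candidate(line):
            return line
    return None


# Hypothetical, heavily simplified source lines (arguments elided).
sample = [
    "static ssize_t __cgroup_procs_write(/* arguments elided */)",
    "{",
    "retry_find_task:",
    "\trcu_read_lock();",
]

print(nearest_funcname(sample, 3))
```

The label line is rejected only because of the trailing [^:]; with a
plain "starts with an identifier character" pattern it would be the
nearest match and end up in the hunk header.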

> @@ -2480,7 +2442,7 @@ retry_find_task:
>  	get_task_struct(tsk);
>  	rcu_read_unlock();
>  
> -	threadgroup_lock(tsk);
> +	percpu_down_write(&cgroup_threadgroup_rwsem);
>  	if (threadgroup) {
>  		if (!thread_group_leader(tsk)) {
>  			/*
> @@ -2490,7 +2452,7 @@ retry_find_task:
>  			 * try again; this is
>  			 * "double-double-toil-and-trouble-check locking".
>  			 */
> -			threadgroup_unlock(tsk);
> +			percpu_up_write(&cgroup_threadgroup_rwsem);
>  			put_task_struct(tsk);
>  			goto retry_find_task;
>  		}

> @@ -2703,17 +2665,17 @@ static int cgroup_update_dfl_csses(struct cgroup *cgrp)
>  				goto out_finish;
>  			last_task = task;
>  
> -			threadgroup_lock(task);
> +			percpu_down_write(&cgroup_threadgroup_rwsem);
>  			/* raced against de_thread() from another thread? */
>  			if (!thread_group_leader(task)) {
> -				threadgroup_unlock(task);
> +				percpu_up_write(&cgroup_threadgroup_rwsem);
>  				put_task_struct(task);
>  				continue;
>  			}
>  
>  			ret = cgroup_migrate(src_cset->dfl_cgrp, task, true);
>  
> -			threadgroup_unlock(task);
> +			percpu_up_write(&cgroup_threadgroup_rwsem);
>  			put_task_struct(task);
>  
>  			if (WARN(ret, "cgroup: failed to update controllers for the default hierarchy (%d), further operations may crash or hang\n", ret))


So my only worry with this patch-set is that these operations will be
hugely expensive.

Now it looks like the cgroup_update_dfl_csses() thing is very rare; it's
when you change which controllers are active in a given subtree under
the uber-l337-super-comount design.

The other one, __cgroup_procs_write(), runs on every /procs and /tasks
write to a cgroup, and that does worry me; this could be a somewhat
common thing.

The Changelog states task migration is a cold path, but is tens of
milliseconds per task really no problem?
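
As a back-of-the-envelope sketch of that worry: a single global write
lock serializes migrations, so the per-task cost simply adds up. The
20 ms figure and the task counts below are illustrative assumptions
picked from the "tens of milliseconds" range, not measurements.

```python
def total_migration_seconds(num_tasks: int, per_task_ms: float) -> float:
    """Total wall time if migrations run one after another,
    each paying per_task_ms for the write-side lock."""
    return num_tasks * per_task_ms / 1000.0


# Hypothetical workloads: a handful of tasks up to a large batch.
for n in (10, 1_000, 100_000):
    secs = total_migration_seconds(n, 20.0)
    print(f"{n:>7} tasks at 20 ms each: {secs:.1f} s")
```

Whether that matters depends on how often whole batches of tasks get
written to /procs or /tasks in practice.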