linux-kernel - Re: [RFC][PATCH] cgroup: fix race between fork and cgroup freezing

Message-ID: <4F5DBB8C.6090904@cn.fujitsu.com>
Date:	Mon, 12 Mar 2012 17:02:04 +0800
From:	Li Zefan <lizf@...fujitsu.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Cgroups <cgroups@...r.kernel.org>, Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	缪 勰 <miaox@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC][PATCH] cgroup: fix race between fork and cgroup freezing

Tejun Heo wrote:
> Hello, Li.
> 
> On Fri, Mar 09, 2012 at 02:26:05PM +0800, Li Zefan wrote:
>> The problem is, forks can happen at any time, so there's no way to prevent
>> forks from happening while iterating tasks in a cgroup, so controllers
>> have to deal with it. In fact freezer is somewhat aware of this issue,
>> that's why it provides the ->fork callback, but there's a race.
>>
>> This patch is not too bad (needs a bit of modification). cgroup core will detect
>> (via seqcount) if something's happened to a cgroup and the tasks in it, and
>> then cgroup will notify controllers to check if newly-forked tasks should
>> be adjusted accordingly, so they will have consistent status with other tasks
>> in the same cgroup.
> 
> But why can't we just do what every sane subsystem would do - link
> first and then invoke notification callback?  I mean, we're now
> essentially trying to do the following.
> 
> 1. Take some action.
> 2. Trigger notification.
> 3. Link the result of the action to list.
> 
> So, of course, if someone tries to traverse the "results", there's a
> race window between #2 and #3.  Your fix seems to change the traverser
> to,
> 
> 1. Traverse the list.
> 2. If something happened in between, take another look.
> 
> But, the right thing to do would be changing the fork path to
> 
> 1. Take some action.
> 2. Link the result of the action to list.
> 3. Trigger notification.
> 

The reasons are:

- We would still need some kind of locking to synchronize fork and the
traverser. The fork side is protected by tasklist_lock, while the traverser
takes css_set_lock.

- Once the new task is linked to the css_set list, it is visible and thus
can be moved to another cgroup. That makes things more complicated, and
the subsystem callbacks may have to acquire cgroup_mutex.

- The task_counter subsystem wants to be notified before the new task
is linked, so that it is able to abort the fork.
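
To make the window concrete, here is a rough single-threaded userspace
model of the notify-then-link ordering and of a seqcount-style retry in
the traverser. It is only a sketch: struct task, struct cgroup_model,
fork_notify(), fork_link(), freeze_with_retry() and race_demo() are
hypothetical names made up for illustration, not kernel code, and a plain
counter stands in for a real seqcount (no locking, no memory barriers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct task {
	bool frozen;
	bool linked;		/* visible to traversers of the cgroup */
};

struct cgroup_model {
	struct task tasks[MAX_TASKS];
	size_t ntasks;
	unsigned int seq;	/* bumped each time a fork links a task */
	bool freezing;
};

/* Fork step 1: ->fork style notification, run before the task is linked. */
static struct task *fork_notify(struct cgroup_model *cg)
{
	assert(cg->ntasks < MAX_TASKS);
	struct task *t = &cg->tasks[cg->ntasks++];

	t->linked = false;
	t->frozen = cg->freezing;	/* state sampled at notification time */
	return t;
}

/* Fork step 2: link the task, making it visible, and record the change. */
static void fork_link(struct cgroup_model *cg, struct task *t)
{
	t->linked = true;
	cg->seq++;
}

/* One traversal: freeze every task that is currently linked. */
static void freeze_pass(struct cgroup_model *cg)
{
	cg->freezing = true;
	for (size_t i = 0; i < cg->ntasks; i++)
		if (cg->tasks[i].linked)
			cg->tasks[i].frozen = true;
}

/*
 * Traverse again if the counter changed during the pass. pending_link,
 * if non-NULL, models a fork that links its task right after the first
 * pass has finished. Returns the number of passes made.
 */
static unsigned int freeze_with_retry(struct cgroup_model *cg,
				      struct task *pending_link)
{
	unsigned int passes = 0, start;

	do {
		start = cg->seq;
		freeze_pass(cg);
		passes++;
		if (pending_link) {
			fork_link(cg, pending_link);
			pending_link = NULL;
		}
	} while (cg->seq != start);

	return passes;
}

/* True if every linked task is frozen. */
static bool all_linked_frozen(const struct cgroup_model *cg)
{
	for (size_t i = 0; i < cg->ntasks; i++)
		if (cg->tasks[i].linked && !cg->tasks[i].frozen)
			return false;
	return true;
}

/* Replay the interleaving: notify, then freeze, then link. */
static bool race_demo(bool retry)
{
	struct cgroup_model cg = {0};
	struct task *child = fork_notify(&cg);	/* sees freezing == false */

	if (retry) {
		freeze_with_retry(&cg, child);
	} else {
		freeze_pass(&cg);	/* child not linked yet: skipped */
		fork_link(&cg, child);
	}
	return all_linked_frozen(&cg);
}
```

In this model race_demo(false) leaves an unfrozen task in a frozen
cgroup, while race_demo(true) notices the counter change and makes a
second pass that picks up the new task.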