linux-kernel - Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a cgroup being destroyed

Message-ID: <CABEgKgrJ68wU-L17zwN4_htX948TNFnLVgts=hFeY7QG3etwCA@mail.gmail.com>
Date:	Sat, 28 Apr 2012 09:20:52 +0900
From:	Hiroyuki Kamezawa <kamezawa.hiroyuki@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
	Michal Hocko <mhocko@...e.cz>,
	Johannes Weiner <hannes@...xchg.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Glauber Costa <glommer@...allels.com>,
	Han Ying <yinghan@...gle.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a
 cgroup being destroyed

On Sat, Apr 28, 2012 at 5:40 AM, Tejun Heo <tj@...nel.org> wrote:
> On Fri, Apr 27, 2012 at 03:04:14PM +0900, KAMEZAWA Hiroyuki wrote:
>> When ->pre_destroy() is called, it should be guaranteed that
>> new child cgroup is not created under a cgroup, where pre_destroy()
>> is running. If not, ->pre_destroy() must check children and
>> return -EBUSY, which causes warning.
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
>
> Hmm... I'm getting confused more.  Why do we need these cgroup changes
> at all?  cgroup still has cgrp->count check and
> cgroup_clear_css_refs() after pre_destroy() calls.  The order of
> changes should be,
>
> * Make memcg pre_destroy() not fail; however, pre_destroy() should
>  still be ready to be retried.  That's the defined interface.
>
> * cgroup core updated to drop pre_destroy() retrying and guarantee
>  that pre_destroy() invocation will happen only once.
>
> * memcg and other cgroups can update their pre_destroy() if the "won't
>  be retried" part can simplify their implementations.
>

What I thought was...
Assume a memory cgroup A, with use_hierarchy==1.

1.  thread:0   starts calling ->pre_destroy() of cgroup A
2.  thread:0   sometimes calls cond_resched() or other sleeping functions.
3.  thread:1   creates a cgroup B under "A"
4.  thread:1   attaches a thread X to cgroup A/B
5.  The res_counter of A is charged, but pre_destroy() can't see what
    happened because it scans only the LRU of A.
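
A rough user-space sketch of the sequence above, replayed in a single
thread. This is not kernel code: struct model_cgroup and the model_*
functions are hypothetical names made up for illustration, and the
charge accounting is heavily simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a memory cgroup; not a kernel structure. */
struct model_cgroup {
	struct model_cgroup *parent;
	long usage;	/* stands in for the res_counter usage */
	long lru_pages;	/* pages on this cgroup's own LRU */
};

/* With use_hierarchy==1, a charge is accounted to every ancestor too. */
static void model_charge(struct model_cgroup *cg, long pages)
{
	cg->lru_pages += pages;
	for (struct model_cgroup *c = cg; c; c = c->parent)
		c->usage += pages;
}

/*
 * Simplified pre_destroy(): it only empties cg's own LRU, handing the
 * pages to the parent. Charges made through a child are never seen.
 */
static int model_pre_destroy(struct model_cgroup *cg)
{
	long pages = cg->lru_pages;

	cg->lru_pages = 0;
	cg->usage -= pages;
	if (cg->parent)
		cg->parent->lru_pages += pages;	/* ancestors already charged */
	assert(cg->usage >= 0);
	return cg->usage == 0 ? 0 : -EBUSY;
}

/* Replays steps 1-5 in order and returns what pre_destroy(A) reports. */
static int model_replay_race(void)
{
	struct model_cgroup a = { NULL, 0, 0 };
	struct model_cgroup b = { &a, 0, 0 };

	model_charge(&a, 4);	/* A owns some pages before rmdir */
	/* steps 1-2: thread 0 is inside pre_destroy(A) and sleeps */
	/* steps 3-4: thread 1 creates A/B, attaches X, X gets charged */
	model_charge(&b, 2);
	/* step 5: thread 0 scans only A's LRU; B's charge remains on A */
	return model_pre_destroy(&a);
}
```

In this model, model_replay_race() returns -EBUSY, because the 2 pages
charged through B stay in A's usage even though A's own LRU is empty.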

So, we have -EBUSY now. I considered some options to fix this.

option 1) just return 0 instead of -EBUSY when pre_destroy() finds a
task or a child.

There is a race: even if we return 0 here and expect the cgroup code
to catch it, the thread or child we found may be moved to another
cgroup before the cgroup core's final check.
In that case, the cgroup will be freed before pre_destroy() has fully
completed, and the charges will be lost.

option 2) move all the code to ->destroy()
That was the previous version of this patch set.

This patch is option 3): prevent the creation of new children.
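
The idea of option 3 can be sketched in user space as below. This is
only a model, not the patch itself: struct guard_cgroup, the "removing"
flag and the guard_* functions are hypothetical names, locking is
omitted (the real checks would have to run under the appropriate cgroup
lock), and -EBUSY as the error for a refused child is an assumption
made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a cgroup; not a kernel structure. */
struct guard_cgroup {
	bool removing;		/* set before pre_destroy() starts */
	int nr_children;
};

/* Called at the start of rmdir, before any pre_destroy() call. */
static int guard_begin_destroy(struct guard_cgroup *cg)
{
	if (cg->nr_children > 0)
		return -EBUSY;	/* rmdir of a cgroup with children fails */
	cg->removing = true;
	return 0;
}

/* Called from mkdir: refuse a new child while the parent is going away. */
static int guard_create_child(struct guard_cgroup *parent)
{
	if (parent->removing)
		return -EBUSY;
	parent->nr_children++;
	return 0;
}

/* Called when a child has been fully removed. */
static void guard_remove_child(struct guard_cgroup *parent)
{
	parent->nr_children--;
}
```

With this guard, steps 3 and 4 of the scenario cannot happen once
thread 0 has started destroying A, so pre_destroy() never has to look
for children that appear behind its back.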

If you don't like this, I'll move all the code to ->destroy() and use
the asynchronous approach again.

Thanks,
-Kame