linux-kernel - Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a cgroup being destroyed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CABEgKgpPXPu3L6oS6+2+dZmcPS=t-ZR7PnCvm0mo8UFeXPHDog@mail.gmail.com>
Date:	Sat, 28 Apr 2012 18:31:38 +0900
From:	Hiroyuki Kamezawa <kamezawa.hiroyuki@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
	Michal Hocko <mhocko@...e.cz>,
	Johannes Weiner <hannes@...xchg.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Glauber Costa <glommer@...allels.com>,
	Han Ying <yinghan@...gle.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a
 cgroup being destroyed

On Sat, Apr 28, 2012 at 11:00 AM, Tejun Heo <tj@...nel.org> wrote:
> Hi, KAME.
>
> On Sat, Apr 28, 2012 at 09:20:52AM +0900, Hiroyuki Kamezawa wrote:
>> What I thought was...
>> Assume a memory cgoup A, with use_hierarchy==1.
>>
>> 1.  thread:0   start calling pre->destroy of cgroup A
>> 2.  thread:0   it sometimes calls cond_resched or other sleep functions.
>> 3.  thread:1   create a cgroup B under "A"
>> 4.  thread:1   attach a thread X to cgroup A/B
>> 5.  res_counter of A charged up. but pre_destroy() can't find what happens
>>     because it scans LRU of A.
>>
>> So, we have -EBUSY now. I considered some options to fix this.
>>
>> option 1) just return 0 instead of -EBUSY when pre_destroy() finds a
>> task or a child.
>>
>> There is a race....even if we return 0 here and expects cgroup code
>> can catch it,
>> the thread or a child we found may be moved to other cgroup before we check it
>> in cgroup's final check.
>> In that case, the cgroup will be freed before full-ack of
>> pre_destory() and the charges
>> will be lost.
>
> So, cgroup code won't proceed with rmdir if children are created
> inbetween and note that the race condition of lost charge you
> described above existed before this change - ie. new cgroup could be
> created after pre_destroy() is complete.
>
> The current cgroup rmdir code is transitional.  It has to support both
> retrying and non-retrying pre_destroy()s and that means we can't mark
> the cgroup DEAD before starting invoking pre_destroy(); however, we
> can do that once memcg's pre_destroy() is converted which will also
> remove all the WAIT_ON_RMDIR mechanism and the above described race.
>
> There really isn't much point in trying to make the current cgroup
> rmdir behave perfectly when the next step is removing all the fixed up
> parts.
>
> So, IMHO, just making pre_destroy() clean up its own charges and
> always returning 0 is enough.  There's no need to fix up old
> non-critical race condition at this point in the patch stream.  cgroup
> rmdir simplification will make them disappear anyway.
>
So, hmm, ok. I'll drop patch 7 & 8. memcg may return -EBUSY in very very
race case but users will not see it in the most case.
I'll fix limit, move-charge and use_hierarchy problem first.
Thanks,
-Kame
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/