Message-Id: <20090626082956.a90335db.kamezawa.hiroyu@jp.fujitsu.com>
Date: Fri, 26 Jun 2009 08:29:56 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
nishimura@....nes.nec.co.jp, balbir@...ux.vnet.ibm.com,
lizf@...fujitsu.com, menage@...gle.com
Subject: Re: [PATCH 0/2] memcg: cgroup fix rmdir hang
On Thu, 25 Jun 2009 14:28:09 -0700
Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Tue, 23 Jun 2009 16:07:20 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
> > previous discussion was this => http://marc.info/?t=124478543600001&r=1&w=2
> >
> > This patch tries to fix the following problem:
> > - rmdir can sleep for a very long time if a swap entry is shared between
> > multiple cgroups
> >
> > Now, cgroup's rmdir path does the following:
> >
> > ==
> > again:
> > check there are no tasks and no child groups.
> > call pre_destroy()
> > check css's refcnt
> > if (refcnt > 0) {
> > sleep until css's refcnt goes down to 0.
> > goto again
> > }
> > ==
> >
> > Unfortunately, memory cgroup does the following at charge time.
> >
> > css_get(&memcg->css)
> > ....
> > charge(memcg) (increase USAGE)
> > ...
> > And this "memcg" does not necessarily contain the calling task.
> >
> > pre_destroy() tries to reduce memory usage until USAGE goes down to 0.
> > Then there is a race in which:
> > - css's refcnt > 0 (and memcg's usage > 0)
> > - rmdir() caller sleeps until css->refcnt goes down to 0.
> > - But to bring css->refcnt down to 0, pre_destroy() must be called again.
> >
> > This patch tries to fix this in an asynchronous way (i.e. without a big lock.)
> > Any comments are welcome.
> >
>
> Do you believe that these fixes should be backported into 2.6.30.x?
Yes, I think so (if it's easy).
To be honest, to trigger the problem:
- a swap entry must be shared between cgroups, and
- rmdir must be called within the critical race window.
Since the usual use of cgroups is for containers, typical users will not share
swap between cgroups. But 2.6.30 may become the base kernel of a major distro, so
I hope this goes into 2.6.30.x if there are no difficulties.
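For illustration only, the race from the quoted cover letter can be modelled in user space. This is a toy sketch, not kernel code: struct toy_memcg, toy_charge(), toy_pre_destroy() and toy_rmdir_pass() are hypothetical names, and it assumes each charge keeps one css reference until it is uncharged.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a memory cgroup; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	int css_refcnt;	/* models the css reference count */
	long usage;	/* models USAGE (number of charges) */
};

/* Charge path from the cover letter: css_get(), then charge(). */
static void toy_charge(struct toy_memcg *m)
{
	m->css_refcnt++;	/* css_get(&memcg->css) */
	m->usage++;		/* charge(memcg) (increase USAGE) */
}

/*
 * pre_destroy(): reduce usage until it reaches 0.
 * Assumption: every uncharge also drops the reference taken at charge.
 */
static void toy_pre_destroy(struct toy_memcg *m)
{
	while (m->usage > 0) {
		m->usage--;
		assert(m->css_refcnt > 0);
		m->css_refcnt--;
	}
}

/*
 * One pass of the rmdir loop. late_charges models charges that arrive
 * after pre_destroy() but before the refcnt check (e.g. via a swap
 * entry shared with another cgroup).
 * Returns true if rmdir could finish, false if it would go to sleep.
 */
static bool toy_rmdir_pass(struct toy_memcg *m, int late_charges)
{
	toy_pre_destroy(m);
	for (int i = 0; i < late_charges; i++)
		toy_charge(m);
	return m->css_refcnt == 0;
}
```

With one late charge, toy_rmdir_pass() returns false and leaves usage and refcnt at 1. In this model nothing except another pre_destroy() lowers them, so a caller that only sleeps on the refcnt is never woken; a second pass with no late charge succeeds.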
Thanks,
-Kame