Message-ID: <48A79690.1090600@cn.fujitsu.com>
Date: Sun, 17 Aug 2008 11:10:08 +0800
From: Li Zefan <lizf@...fujitsu.com>
To: "IKEDA, Munehiro" <m-ikeda@...jp.nec.com>
CC: linux-kernel@...r.kernel.org, menage@...gle.com,
Linux Containers <containers@...ts.linux-foundation.org>,
balbir@...ux.vnet.ibm.com
Subject: Re: [PATCH] cgroup: memory.force_empty can make system slowdown
Li Zefan wrote:
> IKEDA, Munehiro wrote:
>> Cgroup's memory controller has a control file "memory.force_empty"
>> to reset usage account charged to a cgroup. The account shouldn't
>> be reset if one or more processes are attached to the cgroup (at
>> least for memory controller, IMHO). So mem_cgroup_force_empty()
>> is implemented to return -EBUSY and do nothing if so.
>> However, the cgroup at the hierarchy root seems to be an unintended
>> exception. Even if processes are attached to the root cgroup (which is
>> the "default" cgroup for processes), force-empty can be started by
>> writing anything to memory.force_empty, and it will never finish.
>>
>
> I found this bug last week, and I've made patches to fix it, but then
> I was on vacation. I'll send the patches out soon.
>
>> Following patch prevents this issue.
>>
>> This patch is for the cgroup infrastructure code. The issue could also
>> be addressed by modifying the memory controller code, namely by changing
>> mem_cgroup_force_empty() to check the CSS_ROOT bit of css->flags.
>> I believe the cgroup->count approach in the patch below is more
>> generic and reasonable. How does that sound?
>>
>
> It's ok for the top_cgroup's count to be 0 due to the top_cgroup hack.
> With this patch, the top cgroup's count will always be >0, even if it
> has no tasks in it, so writing to top_cgroup's force_empty will always
> return -EBUSY.
>
I thought cgrp->css_sets would be empty when there are no tasks in the top cgroup,
but I was wrong: init_css_set's refcount is always >0,
so cgroup_task_count() never returns 0 for the top cgroup:
# mount -t cgroup -o debug xxx /mnt
# mkdir /mnt/sub
# for pid in `cat /mnt/tasks`; do echo $pid > /mnt/sub/tasks; done
# cat /mnt/tasks
# cat /mnt/debug.taskcount
3
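
To make the reasoning above concrete, here is a minimal userspace sketch in C.
It is not the kernel code: every toy_* name is made up for illustration, and it
assumes that the task count is the sum of the refcounts of the css_sets linked
to a cgroup, and that init_css_set holds one permanent reference (the real
number of extra references may differ).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct css_set: only the refcount matters here. */
struct toy_css_set {
	int refcount;			/* one per attached task, plus extra pins */
	struct toy_css_set *next;	/* next css_set linked to the same cgroup */
};

/* Hypothetical stand-in for struct cgroup. */
struct toy_cgroup {
	struct toy_css_set *css_sets;	/* css_sets linked to this cgroup */
};

/* Assumption: the count is the sum of the linked css_sets' refcounts. */
static int toy_task_count(const struct toy_cgroup *cgrp)
{
	const struct toy_css_set *cg;
	int count = 0;

	for (cg = cgrp->css_sets; cg != NULL; cg = cg->next)
		count += cg->refcount;
	return count;
}

/* A task joining a css_set takes one reference. */
static void toy_attach(struct toy_css_set *cg)
{
	cg->refcount++;
}

/* A task leaving a css_set drops the reference it took. */
static void toy_detach(struct toy_css_set *cg)
{
	assert(cg->refcount > 0);
	cg->refcount--;
}

/*
 * Attach ntasks tasks to the top cgroup's initial css_set, move them all
 * away again, and return the resulting count for the top cgroup.
 */
static int toy_top_count_after_moving_all(int ntasks)
{
	/* Assumption: the initial css_set carries one permanent reference. */
	struct toy_css_set init_set = { .refcount = 1, .next = NULL };
	struct toy_cgroup top = { .css_sets = &init_set };
	int i;

	for (i = 0; i < ntasks; i++)
		toy_attach(&init_set);
	for (i = 0; i < ntasks; i++)
		toy_detach(&init_set);

	return toy_task_count(&top);
}
```

Calling toy_top_count_after_moving_all() with any number of tasks yields a
non-zero count in this model, because the permanent reference is never dropped.
That is the same effect as the non-zero debug.taskcount above with an empty
tasks file.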