Message-ID: <20080701191126.GA17376@redhat.com>
Date: Tue, 1 Jul 2008 15:11:26 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: linux kernel mailing list <linux-kernel@...r.kernel.org>
Cc: Libcg Devel Mailing List <libcg-devel@...ts.sourceforge.net>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
Paul Menage <menage@...gle.com>,
Peter Zijlstra <pzijlstr@...hat.com>,
kamezawa.hiroyu@...fujitsu.com,
Kazunaga Ikeno <k-ikeno@...jp.nec.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: [RFC] How to handle the rules engine for cgroups
Hi,
While development continues on cgroups and the various controllers, we also
need a facility that lets an admin/user specify which groups to create and
the rules by which tasks are placed in those groups. Group creation will be
handled by libcg, which is already under development. We still need to
decide how the rules are specified and how they are enforced (the rules
engine).
I have gathered a few views on how a rules engine could be implemented and
am listing them below.
Proposal 1
==========
Let a user space daemon handle all of that. The daemon will open a netlink
socket and receive notifications for various kernel events. It will also
parse the admin-specified rules config file and, as events arrive, place
processes in the right cgroup based on those rules.
I have written a prototype user space program which does that. It is
currently in very crude shape and can be found here:
http://people.redhat.com/vgoyal/misc/rules-engine-daemon/user-id-based-namespaces.patch
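As a rough illustration of the rule-matching and placement step such a
daemon would perform (this is not the prototype linked above), here is a
minimal Python sketch. The three-column rule format ("uid command cgroup"),
the function names, and the mount path argument are assumptions made up for
this example. The only real interface it relies on is that a task is moved
by writing its pid to the destination group's "tasks" file. Receiving the
kernel events over netlink is left out.

```python
import fnmatch
import os
from dataclasses import dataclass


@dataclass
class Rule:
    uid: str      # numeric uid as text, or "*" for any user (assumed format)
    command: str  # glob matched against the command name (assumed format)
    cgroup: str   # destination group, relative to the cgroup mount point


def parse_rules(text):
    """Parse 'uid command cgroup' lines; '#' starts a comment."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        uid, command, cgroup = line.split()
        rules.append(Rule(uid, command, cgroup))
    return rules


def match_rule(rules, uid, command):
    """Return the cgroup of the first matching rule, or None."""
    for rule in rules:
        if rule.uid in ("*", str(uid)) and fnmatch.fnmatchcase(command, rule.command):
            return rule.cgroup
    return None


def place_task(mount, cgroup, pid):
    """Move pid by writing it to the group's 'tasks' file."""
    with open(os.path.join(mount, cgroup, "tasks"), "w") as f:
        f.write(str(pid))


def handle_event(rules, mount, pid, uid, command):
    """Called per fork/exec/uid-change event; returns the chosen cgroup."""
    cgroup = match_rule(rules, uid, command)
    if cgroup is not None:
        place_task(mount, cgroup, pid)
    return cgroup
```

First match wins here, which keeps the rule file easy to reason about; a
real rules engine may well want a different precedence.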
Various people have raised two main issues with this approach.
- netlink is not a reliable protocol.
  - Messages can be dropped and lost. That means a newly forked process
    might never be placed in its intended group.
- How to handle delays in rule execution?
  - For example, if an "exec" happens, then by the time the process is
    moved to the right group it might have forked off a few more
    processes or done a fair amount of memory allocation, which will be
    charged to the wrong group. Or the newly exec'ed process might get
    killed in its existing cgroup for lack of memory (despite the fact
    that the destination cgroup has sufficient memory).
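The delay concern can be made concrete with a toy model. The Python sketch
below is an illustration only, not kernel behaviour: the function name, the
(time, size) allocation tuples, and the simple "kill when over limit" rule
are all assumptions for this example. It also assumes that memory charged
before the move stays charged to the old group.

```python
def simulate(allocations, move_time, old_limit, new_limit):
    """Toy model of a task that is moved between groups at move_time.

    allocations is a list of (time, size) tuples. Returns a tuple
    (charged_old, charged_new, killed_at), where killed_at is the time of
    the allocation that pushed a group over its limit, or None.
    """
    charged_old = 0
    charged_new = 0
    for when, size in sorted(allocations):
        if when < move_time:
            # The daemon has not moved the task yet: the old group pays.
            charged_old += size
            if charged_old > old_limit:
                return charged_old, charged_new, when
        else:
            charged_new += size
            if charged_new > new_limit:
                return charged_old, charged_new, when
    return charged_old, charged_new, None
```

With allocations of 100 at t=1, 200 at t=2 and 50 at t=5, and the move
happening at t=3, the old group ends up charged 300 units it was never
meant to carry. If its limit is 250, the task is killed at t=2 even though
the destination group had room.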
Proposal 2
==========
Implement one or more kernel modules which implement the rules engine.
A user space program can parse the config files and pass the rules to the
module. The kernel will be patched only at select points to look up the
rules (as provided by the modules). Very little code runs inside the kernel
if no rules are loaded.
Concerns:
- Rules can become complex and we don't want to handle that complexity in
kernel.
Pros:
- Reliable and precise movement of tasks into the right cgroup based on rules.
Proposal 3
==========
How about passing additional parameters to system calls, so that the
destination cgroup can be given as an extra parameter? Probably something
like the sys_indirect proposal. Maybe glibc can act as a wrapper that passes
the additional parameter so that applications don't need any modifications.
Concerns:
========
- Looks like the sys_indirect interface for passing extra flags was rejected.
- Requires extra work in glibc, which may also involve parsing of rule
files. :-(
Proposal 4
==========
There are some vague thoughts about freezing the process or thread upon
fork or exec and unfreezing it once it has been placed in the right cgroup.
Concerns:
========
- Requires a reliable netlink protocol; otherwise there is a possibility
that a task never gets unfrozen.
- On what basis does one freeze a thread? If there are no rules to process
for that thread, we will delay it unnecessarily.
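Both concerns show up in even a very small model of the freeze/unfreeze
bookkeeping. The Python sketch below is hypothetical throughout: the
FreezeGate class, its method names, and the may_match pre-check are invented
for illustration and do not correspond to any existing kernel or libcg
interface.

```python
class FreezeGate:
    """Toy model: freeze a task on fork/exec, thaw it once it is placed."""

    def __init__(self, may_match):
        # may_match(uid, command) -> bool is a hypothetical pre-check that
        # says whether any loaded rule could apply to the task at all.
        self.may_match = may_match
        self.frozen = set()

    def on_event(self, pid, uid, command):
        """Called at fork/exec time; returns True if the task was frozen."""
        if not self.may_match(uid, command):
            return False  # no rule can apply, so do not delay the task
        self.frozen.add(pid)
        return True

    def on_placed(self, pid):
        """Called when the 'task has been placed' notification arrives."""
        self.frozen.discard(pid)

    def stuck(self):
        """Tasks still frozen, e.g. because their notification was lost."""
        return sorted(self.frozen)
```

If the notification for a pid is dropped, on_placed() is never called and
that pid stays in stuck() forever, which is the first concern. The second
concern is the question of what may_match should be, and how cheaply it can
be evaluated at fork/exec time.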
Please provide your inputs on the best way to handle the rules engine.
To me, letting the rules live in a separate module (or modules) seems a
reasonable way forward: it provides reliable and timely execution of rules,
and by making it modular we can keep most of the complexity out of core
kernel code.
Thanks
Vivek