linux-kernel - Re: [RFD] Merge task counter into memcg

Message-ID: <20120412010745.GE1787@cmpxchg.org>
Date:	Thu, 12 Apr 2012 03:07:45 +0200
From:	Johannes Weiner <hannes@...xchg.org>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Hugh Dickins <hughd@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Glauber Costa <glommer@...allels.com>,
	Tejun Heo <tj@...nel.org>, Daniel Walsh <dwalsh@...hat.com>,
	"Daniel P. Berrange" <berrange@...hat.com>,
	Li Zefan <lizf@...fujitsu.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Cgroups <cgroups@...r.kernel.org>,
	Containers <containers@...ts.linux-foundation.org>
Subject: Re: [RFD] Merge task counter into memcg

On Wed, Apr 11, 2012 at 08:57:20PM +0200, Frederic Weisbecker wrote:
> Hi,
> 
> While talking with Tejun about targeting the cgroup task counter subsystem
> for the next merge window, he suggested checking whether this could be merged
> into the memcg subsystem rather than creating a new cgroup subsystem just
> for the purpose of limiting the task count.
> 
> So I'm pinging you guys to seek your insight.

I'm sorry you are given a runaround like this with that code.

> I assume not everybody in the Cc list knows what the task counter subsystem
> is all about. So here is a summary: this is a cgroup subsystem (latest version
> in https://lwn.net/Articles/478631/) that keeps track of the number of tasks
> present in a cgroup. Hooks are set in task fork/exit and cgroup migration to
> keep this accounting up to date and visible in a special tasks.usage file. The
> user can set a limit on the number of tasks by writing to the tasks.limit file.
> Further forks or cgroup migrations are then rejected if the limit is exceeded.
> 
> This feature is especially useful to protect against forkbombs in containers.
> Or, more generally, to limit the resources consumed by the tasks in a cgroup,
> since each task involves some kernel memory allocation.

You could also twist this around and argue the same for cpu usage and
make it part of the cpu cgroup, but it doesn't really fit in either
subsystem, IMO.

> Now the dilemma is how to implement it.
> 
> 1) As a standalone subsystem, as it stands currently (https://lwn.net/Articles/478631/)

What was wrong with that again?

> 2) As a feature in memcg, part of the memory.kmem.* files. This makes sense
> because this is about limiting kernel memory allocation. We could have a
> memory.kmem.tasks.count
> 
> My personal opinion is that the task counter brings some overhead: a charge
> across the whole hierarchy at every fork, and the mirrored uncharge on task exit.
> And this overhead happens even in the off-case (when the task counter subsystem
> is mounted but the limit is the default: ULLONG_MAX).

3) Make it an integral part of cgroups, because keeping track of tasks
in them already is, so it would be a more natural approach than
bolting it onto the memory controller.

But this has the same overhead.  And even if this would end up being a
better idea, we could still do this after merging it as a separate
controller as long as we maintain the interface.
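
To make the overhead under discussion concrete, here is a minimal user-space
sketch of a hierarchical task counter. The struct and function names are
hypothetical, not the ones used in the posted patches, and the reject-at-limit
rule (usage >= limit) is an assumption. The point is only that a charge walks
every ancestor on each fork, even when all limits are left at ULLONG_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cgroup task counter (not a kernel struct). */
struct task_counter {
	unsigned long long usage;     /* tasks charged here, cf. tasks.usage */
	unsigned long long limit;     /* ULLONG_MAX means unlimited, cf. tasks.limit */
	struct task_counter *parent;  /* NULL at the root of the hierarchy */
};

static void task_counter_init(struct task_counter *c,
			      struct task_counter *parent)
{
	c->usage = 0;
	c->limit = ULLONG_MAX;
	c->parent = parent;
}

/* Drop one task from c and each ancestor, stopping before 'stop'. */
static void task_counter_uncharge_until(struct task_counter *c,
					struct task_counter *stop)
{
	for (; c != stop; c = c->parent) {
		assert(c->usage > 0);
		c->usage--;
	}
}

/*
 * Charge one task, as a fork hook would. The loop visits the whole
 * hierarchy regardless of whether any limit is set; on failure the
 * levels already charged are rolled back.
 */
static bool task_counter_charge(struct task_counter *c)
{
	struct task_counter *pos;

	for (pos = c; pos != NULL; pos = pos->parent) {
		if (pos->usage >= pos->limit) {
			task_counter_uncharge_until(c, pos);
			return false;
		}
		pos->usage++;
	}
	return true;
}

/* Mirror of the charge, as an exit hook would do. */
static void task_counter_uncharge(struct task_counter *c)
{
	task_counter_uncharge_until(c, NULL);
}
```

With a limit of 2 on a parent and none on its child, the third charge against
the child fails and leaves both counters at 2. Locking and cgroup migration
are left out of the sketch.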

> So if we choose the second solution, this overhead will be added unconditionally
> to memcg.
> But I don't expect every user of memcg to need the task counter. So perhaps
> the overhead should be kept in its own separate subsystem.
> 
> OTOH the memory.kmem.* interface would be a good fit.
> 
> What do you think?

Instead of integrating it task-wise, could the problem be solved by
accounting the kernel stack to kmem?  And then have a kmem limit,
which we already want anyway?

After all, we would only restrict the number of tasks for the
resources they require, not to only allow an arbitrary number of tasks
(unless one wants to sell Windows 7 Starter style containers, in which
case one can go play with oneself out of tree as far as I'm concerned)
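
A rough user-space sketch of that alternative, under stated assumptions: the
names are hypothetical, KSTACK_BYTES assumes an 8 KiB kernel stack per task,
and the counter is flat (no hierarchy). Forks are not counted at all; they
simply fail once the kernel memory they need no longer fits under the limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: each task needs an 8 KiB kernel stack. */
#define KSTACK_BYTES 8192ULL

/* Hypothetical flat kernel-memory counter for one cgroup. */
struct kmem_counter {
	unsigned long long usage;  /* bytes of kernel memory charged */
	unsigned long long limit;  /* byte limit */
};

/* Charge 'bytes' if it fits under the limit; refuse otherwise. */
static bool kmem_charge(struct kmem_counter *k, unsigned long long bytes)
{
	if (k->usage > k->limit || bytes > k->limit - k->usage)
		return false;
	k->usage += bytes;
	return true;
}

static void kmem_uncharge(struct kmem_counter *k, unsigned long long bytes)
{
	assert(k->usage >= bytes);
	k->usage -= bytes;
}

/* Fork path: the new task's kernel stack is just another kmem charge. */
static bool fork_charge_stack(struct kmem_counter *k)
{
	return kmem_charge(k, KSTACK_BYTES);
}

/* Exit path: give the stack's bytes back. */
static void exit_uncharge_stack(struct kmem_counter *k)
{
	kmem_uncharge(k, KSTACK_BYTES);
}
```

The effective task cap then falls out of the byte limit and shrinks as other
kernel allocations are charged to the same counter, which is the sense in
which tasks are restricted only for the resources they require.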