[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4FC4F415.30007@parallels.com>
Date: Tue, 29 May 2012 20:06:45 +0400
From: Glauber Costa <glommer@...allels.com>
To: Christoph Lameter <cl@...ux.com>
CC: <linux-kernel@...r.kernel.org>, <cgroups@...r.kernel.org>,
<linux-mm@...ck.org>, <kamezawa.hiroyu@...fujitsu.com>,
Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
Greg Thelen <gthelen@...gle.com>,
Suleiman Souhlal <suleiman@...gle.com>,
Michal Hocko <mhocko@...e.cz>,
Johannes Weiner <hannes@...xchg.org>, <devel@...nvz.org>,
David Rientjes <rientjes@...gle.com>,
Pekka Enberg <penberg@...helsinki.fi>
Subject: Re: [PATCH v3 18/28] slub: charge allocation to a memcg
On 05/29/2012 06:51 PM, Christoph Lameter wrote:
> On Fri, 25 May 2012, Glauber Costa wrote:
>
>> This patch charges allocation of a slab object to a particular
>> memcg.
>
> I am wondering why you need all the other patches. The simplest approach
> would just to hook into page allocation and freeing from the slab
> allocators as done here and charge to the currently active cgroup. This
> avoids all the duplication of slab caches and per node as well as per cpu
> structures. A certain degree of fuzziness cannot be avoided given that
> objects are cached and may be served to multiple cgroups. If that can be
> tolerated then the rest would be just like this patch which could be made
> more simple and non intrusive.
>
Just hooking into the page allocation only works for caches with very
big objects. For all the others, we need to relay the process to the
correct cache.
Some objects may be shared, yes, but in reality most won't.
Let me give you an example:
We track task_struct here. So as a nice side effect of this, a fork bomb
will be killed because it will not be able to allocate any further.
But if we're accounting only at page allocation time, it is quite
possible to come up with a pattern while I always let other cgroups
pay the price for the page, but I will be the one filling it.
Having an eventual dentry, for instance, shared among caches, is okay.
But the main use case is for process in different cgroups dealing with
totally different parts of the filesystem.
So we can't really afford to charge to the process touching the nth
object where n is the number of objects per page. We need to relay it to
the right one.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists