Message-ID: <4E7A342B.5040608@parallels.com>
Date: Wed, 21 Sep 2011 15:59:55 -0300
From: Glauber Costa <glommer@...allels.com>
To: Greg Thelen <gthelen@...gle.com>
CC: <linux-kernel@...r.kernel.org>, <paul@...lmenage.org>,
<lizf@...fujitsu.com>, <kamezawa.hiroyu@...fujitsu.com>,
<ebiederm@...ssion.com>, <davem@...emloft.net>,
<netdev@...r.kernel.org>, <linux-mm@...ck.org>,
<kirill@...temov.name>
Subject: Re: [PATCH v3 2/7] socket: initial cgroup code.
On 09/21/2011 03:47 PM, Greg Thelen wrote:
> On Sun, Sep 18, 2011 at 5:56 PM, Glauber Costa<glommer@...allels.com> wrote:
>> We aim to control the amount of kernel memory pinned at any
>> time by tcp sockets. To lay the foundations for this work,
>> this patch adds a pointer to the kmem_cgroup to the socket
>> structure.
>>
>> Signed-off-by: Glauber Costa<glommer@...allels.com>
>> CC: David S. Miller<davem@...emloft.net>
>> CC: Hiroyouki Kamezawa<kamezawa.hiroyu@...fujitsu.com>
>> CC: Eric W. Biederman<ebiederm@...ssion.com>
> ...
>> +void sock_update_memcg(struct sock *sk)
>> +{
>> +	/* right now a socket spends its whole life in the same cgroup */
>> +	BUG_ON(sk->sk_cgrp);
>> +
>> +	rcu_read_lock();
>> +	sk->sk_cgrp = mem_cgroup_from_task(current);
>> +
>> +	/*
>> +	 * We don't need to protect against anything task-related, because
>> +	 * we are basically stuck with the sock pointer that won't change,
>> +	 * even if the task that originated the socket changes cgroups.
>> +	 *
>> +	 * What we do have to guarantee, is that the chain leading us to
>> +	 * the top level won't change under our noses. Incrementing the
>> +	 * reference count via cgroup_exclude_rmdir guarantees that.
>> +	 */
>> +	cgroup_exclude_rmdir(mem_cgroup_css(sk->sk_cgrp));
>
> This grabs a css_get() reference, which prevents rmdir (will return
> -EBUSY).
Yes.
> How long is this reference held?
For the socket lifetime.
> I wonder about the case
> where a process creates a socket in memcg M1 and later is moved into
> memcg M2. At that point an admin would expect to be able to 'rmdir
> M1'. I think this rmdir would return -EBUSY and I suspect it would be
> difficult for the admin to understand why the rmdir of M1 failed. It
> seems that to rmdir a memcg, an admin would have to kill all processes
> that allocated sockets while in M1. Such processes may not still be
> in M1.
>
>> +	rcu_read_unlock();
>> +}
I agree. But I also don't see many ways around it without
implementing full task migration.
Right now I am working under the assumption that tasks are long-lived
inside the cgroup. Migration potentially introduces some nasty locking
problems in the mem_schedule path.
Also, unless I am missing something, the memcg already has the policy of
not carrying charges around, probably because of this very same complexity.
True, that at least won't give you -EBUSY... But I think this is at least a
way to guarantee that the cgroup won't disappear from under our noses in
the middle of our allocations.
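To make the M1/M2 scenario concrete, here is a minimal userspace sketch of
the lifetime rule discussed above: the socket pins the cgroup it was created
in, and rmdir of that cgroup fails with -EBUSY until the socket goes away.
Every name in it (toy_cgroup, toy_sock, toy_rmdir, ...) is made up for
illustration. It is not kernel code and it leaves out RCU and locking
entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not kernel types. */
struct toy_cgroup {
	int refs;     /* references pinned by live sockets */
	int removed;  /* set once rmdir has succeeded */
};

struct toy_sock {
	struct toy_cgroup *cgrp;  /* plays the role of sk->sk_cgrp */
};

/* Socket creation: bind to the creating task's cgroup and pin it. */
static void toy_sock_update_memcg(struct toy_sock *sk,
				  struct toy_cgroup *task_cgrp)
{
	/* Mirrors BUG_ON(sk->sk_cgrp): a socket is bound exactly once. */
	assert(sk->cgrp == NULL);

	sk->cgrp = task_cgrp;
	task_cgrp->refs++;
}

/* Socket destruction: drop the reference taken at creation. */
static void toy_sock_release(struct toy_sock *sk)
{
	if (sk->cgrp) {
		sk->cgrp->refs--;
		sk->cgrp = NULL;
	}
}

/* rmdir is refused while any socket still pins the cgroup. */
static int toy_rmdir(struct toy_cgroup *cg)
{
	if (cg->refs > 0)
		return -EBUSY;

	cg->removed = 1;
	return 0;
}
```

In this model, a task that creates a socket in M1 and then moves to M2 leaves
the socket bound to M1, so toy_rmdir(M1) keeps returning -EBUSY until
toy_sock_release() runs, even though no task remains in M1.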