linux-kernel - Re: [PATCH] memcg: remove KMEM_ACCOUNTED

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <529D9100.4070207@parallels.com>
Date:	Tue, 3 Dec 2013 12:06:24 +0400
From:	Vladimir Davydov <vdavydov@...allels.com>
To:	Glauber Costa <glommer@...il.com>
CC:	Michal Hocko <mhocko@...e.cz>, LKML <linux-kernel@...r.kernel.org>,
	<cgroups@...r.kernel.org>, <devel@...nvz.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Balbir Singh <bsingharora@...il.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

On 12/03/2013 11:56 AM, Glauber Costa wrote:
> On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
> <vdavydov@...allels.com> wrote:
>> On 12/02/2013 10:26 PM, Glauber Costa wrote:
>>> On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko <mhocko@...e.cz> wrote:
>>>> [CCing Glauber - please do so in other posts for kmem related changes]
>>>>
>>>> On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
>>>>> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
>>>>> use static branches when code not in use") in order to guarantee that
>>>>> static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
>>>>> for each memory cgroup when its kmem limit is set. The point is that at
>>>>> that time the memcg_update_kmem_limit() function's workflow looked like
>>>>> this:
>>>>>
>>>>>        bool must_inc_static_branch = false;
>>>>>
>>>>>        cgroup_lock();
>>>>>        mutex_lock(&set_limit_mutex);
>>>>>        if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
>>>>>                /* The kmem limit is set for the first time */
>>>>>                ret = res_counter_set_limit(&memcg->kmem, val);
>>>>>
>>>>>                memcg_kmem_set_activated(memcg);
>>>>>                must_inc_static_branch = true;
>>>>>        } else
>>>>>                ret = res_counter_set_limit(&memcg->kmem, val);
>>>>>        mutex_unlock(&set_limit_mutex);
>>>>>        cgroup_unlock();
>>>>>
>>>>>        if (must_inc_static_branch) {
>>>>>                /* We can't do this under cgroup_lock */
>>>>>                static_key_slow_inc(&memcg_kmem_enabled_key);
>>>>>                memcg_kmem_set_active(memcg);
>>>>>        }
>>>>>
>>>>> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
>>>>> static_key_slow_inc() is called under the set_limit_mutex, but the
>>>>> leftover from the above-mentioned commit is still here. Let's remove it.
>>>> OK, so I have looked there again and 692e89abd154b (memcg: increment
>>>> static branch right after limit set) which went in after cgroup_mutex
>>>> has been removed. It came along with the following comment.
>>>>                  /*
>>>>                   * setting the active bit after the inc will guarantee
>>>> no one
>>>>                   * starts accounting before all call sites are patched
>>>>                   */
>>>>
>>>> This suggests that the flag is needed after all because we have
>>>> to be sure that _all_ the places have to be patched. AFAIU
>>>> memcg_kmem_newpage_charge might see the static key already patched so
>>>> it would do a charge but memcg_kmem_commit_charge would still see it
>>>> unpatched and so the charge won't be committed.
>>>>
>>>> Or am I missing something?
>>> You are correct. This flag is there due to the way we are using static
>>> branches.
>>> The patching of one call site is atomic, but the patching of all of
>>> them are not.
>>> Therefore we need to use a two-flag scheme to guarantee that in the first
>>> time
>>> we turn the static branches on, there will be a clear point after
>>> which we're going
>>> to start accounting.
>>
>> Hi, Glauber
>>
>> Sorry, but I don't understand why we need two flags. Isn't checking the flag
>> set after all call sites have been patched (I mean KMEM_ACCOUNTED_ACTIVE)
>> not enough?
> Take a look at net/ipv4/tcp_memcontrol.c. There are comprehensive comments there
> for a mechanism that basically achieves the same thing. The idea is
> that one flag is used
> at all times and means "it is enabled". The second flags is a one time
> only flag to indicate
> that the patching process is complete. With one flag it seems to work,
> but it is racy.

AFAIU, the point of using two flags in tcp_update_limit() is that we set
the limit and update static branching lockless so the 'activated' flag
is needed there in order to make sure only one process will call
static_key_slow_inc() in case there are concurrent processes setting the
limit. The comment there confirms my assumption:

         * The activated bit is used to guarantee that no two writers
         * will do the update in the same memcg. Without that, we can't
         * properly shutdown the static key.
         */
        if (!test_and_set_bit(MEMCG_SOCK_ACTIVATED, &cg_proto->flags))
            static_key_slow_inc(&memcg_socket_limit_enabled);
        set_bit(MEMCG_SOCK_ACTIVE, &cg_proto->flags);

In memcg_update_kmem_limit() we do the whole process of limit
initialization under a mutex so the situation we need protection from in
tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
never cleared and never checked alone, only along with the 'active'
flag, that's why I doubt we need it at all.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/