linux-kernel - Re: [PATCH] memcg: remove KMEM_ACCOUNTED

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <529F1883.3030907@parallels.com>
Date:	Wed, 4 Dec 2013 15:56:51 +0400
From:	Vladimir Davydov <vdavydov@...allels.com>
To:	Glauber Costa <glommer@...il.com>
CC:	Michal Hocko <mhocko@...e.cz>, LKML <linux-kernel@...r.kernel.org>,
	<cgroups@...r.kernel.org>, <devel@...nvz.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Balbir Singh <bsingharora@...il.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

On 12/04/2013 02:08 PM, Glauber Costa wrote:
>>> Could you do something clever with just one flag? Probably yes. But I
>>> doubt it would
>>> be that much cleaner, this is just the way that patching sites work.
>> Thank you for spending your time to listen to me.
>>
> Don't worry! I thank you for carrying this forward.
>
>> Let me try to explain what is bothering me.
>>
>> We have two state bits for each memcg, 'active' and 'activated'. There
>> are two scenarios where the bits can be modified:
>>
>> 1) The kmem limit is set on a memcg for the first time -
>> memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
>> which sets the 'activated' bit on success, then update static branching,
>> then set the 'active' bit. All three actions are done atomically in
>> respect to other tasks setting the limit due to the set_limit_mutex.
>> After both bits are set, they never get cleared for the memcg.
>>
> So far so good. But again, note how you yourself describe it:
> the cations are done atomically  *in respect to other tasks setting the limit*
>
> But there are also tasks that are running its courses naturally and
> just allocating
> memory. For those, some call sites will be on, some will be off. We need to make
> sure that *none* of them uses the patched site until *all* of them are patched.
> This has nothing to do with updates, this is all about the readers.
>
>> 2) When a subgroup of a kmem-active cgroup is created -
>> memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
>> then increase static branching refcounter, then call
>> memcg_update_cache_sizes() for the new memcg, which may clear the
>> 'activated' bit on failure. After successful execution, the state bits
>> never get cleared for the new memcg.
>>
>> In scenario 2 there is no need bothering about the flags setting order,
>> because we don't have any tasks in the cgroup yet - the tasks can be
>> moved in only after css_online finishes when we have both of the bits
>> set and the static branching enabled. Actually, we already do not bother
>> about it, because we have both bits set before the cgroup is fully
>> initialized (memcg_update_cache_sizes() is called).
>>
> Yes, after the first cgroup is set none of that matters. But it is just easier
> and less error prone just to follow the same path every time. As I have said,
> if you can come up with a more clever way to deal with the problem above
> that doesn't involve the double flag - and you can prove it works - I
> am definitely
> fine with it. But this is subtle code, and in the past - Michal can
> attest this - we've
> changed this being sure it would work just to see it explode in our faces.
>
> So although I am willing to review every patch for correctness on that
> front (I never
> said I liked the 2-flags scheme...), unless you have a bug or real
> problem on it,
> I would advise against changing it if its just to make it more readable.
>
> But again, don't take me too seriously on this. If you and Michal think you can
> come up with something better, I'm all for it.

All right, I finally get you :-)

Although I still don't think we need the second flag, I now understand
that it's better not to change the code that works fine especially the
change does not make it neither more readable nor more effective. Since
I can be mistaken about the flags usage (it's by far not unlikely), it's
better to leave it as is rather than being at risk of catching spurious
hangs that might be caused by this modification.

Thanks for the detailed explanation!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/