lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7bf5372d-7d9d-abee-27dd-5044da5ec489@virtuozzo.com>
Date:   Tue, 24 Apr 2018 14:38:51 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Vladimir Davydov <vdavydov.dev@...il.com>
Cc:     akpm@...ux-foundation.org, shakeelb@...gle.com,
        viro@...iv.linux.org.uk, hannes@...xchg.org, mhocko@...nel.org,
        tglx@...utronix.de, pombredanne@...b.com, stummala@...eaurora.org,
        gregkh@...uxfoundation.org, sfr@...b.auug.org.au, guro@...com,
        mka@...omium.org, penguin-kernel@...ove.SAKURA.ne.jp,
        chris@...is-wilson.co.uk, longman@...hat.com, minchan@...nel.org,
        hillf.zj@...baba-inc.com, ying.huang@...el.com,
        mgorman@...hsingularity.net, jbacik@...com, linux@...ck-us.net,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        willy@...radead.org, lirongqing@...du.com, aryabinin@...tuozzo.com
Subject: Re: [PATCH v2 04/12] mm: Assign memcg-aware shrinkers bitmap to memcg

On 24.04.2018 14:28, Vladimir Davydov wrote:
> On Mon, Apr 23, 2018 at 01:54:50PM +0300, Kirill Tkhai wrote:
>>>> @@ -1200,6 +1206,8 @@ extern int memcg_nr_cache_ids;
>>>>  void memcg_get_cache_ids(void);
>>>>  void memcg_put_cache_ids(void);
>>>>  
>>>> +extern int shrinkers_max_nr;
>>>> +
>>>
>>> memcg_shrinker_id_max?
>>
>> memcg_shrinker_id_max sounds like an includive value, doesn't it?
>> While shrinker->id < shrinker_max_nr.
>>
>> Let's better use memcg_shrinker_nr_max.
> 
> or memcg_nr_shrinker_ids (to match memcg_nr_cache_ids), not sure...
> 
> Come to think of it, this variable is kinda awkward: it is defined in
> vmscan.c but declared in memcontrol.h; it is used by vmscan.c for max
> shrinker id and by memcontrol.c for shrinker map capacity. Just a raw
> idea: what about splitting it in two: one is private to vmscan.c, used
> as max id, say we call it shrinker_id_max; the other is defined in
> memcontrol.c and is used for shrinker map capacity, say we call it
> memcg_shrinker_map_capacity. What do you think?

I don't much like a duplication of the single variable... Are there real
problems, if it defined in memcontrol.{c,h} and use in both of the places?
 
>>>> +int expand_shrinker_maps(int old_nr, int nr)
>>>> +{
>>>> +	int id, size, old_size, node, ret;
>>>> +	struct mem_cgroup *memcg;
>>>> +
>>>> +	old_size = old_nr / BITS_PER_BYTE;
>>>> +	size = nr / BITS_PER_BYTE;
>>>> +
>>>> +	down_write(&shrinkers_max_nr_rwsem);
>>>> +	for_each_node(node) {
>>>
>>> Iterating over cgroups first, numa nodes second seems like a better idea
>>> to me. I think you should fold for_each_node in memcg_expand_maps.
>>>
>>>> +		idr_for_each_entry(&mem_cgroup_idr, memcg, id) {
>>>
>>> Iterating over mem_cgroup_idr looks strange. Why don't you use
>>> for_each_mem_cgroup?
>>
>> We want to allocate shrinkers maps in mem_cgroup_css_alloc(), since
>> mem_cgroup_css_online() mustn't fail (it's a requirement of currently
>> existing design of memcg_cgroup::id).
>>
>> A new memcg is added to parent's list between two of these calls:
>>
>> css_create()
>>   ss->css_alloc()
>>   list_add_tail_rcu(&css->sibling, &parent_css->children)
>>   ss->css_online()
>>
>> for_each_mem_cgroup() does not see allocated, but not linked children.
> 
> Why don't we move shrinker map allocation to css_online then?

Because the design of memcg_cgroup::id prohibits mem_cgroup_css_online() to fail.
This function can't fail.

I don't think it will be good to dive into reworking of this stuff for this patchset,
which is really already big. Also, it will be assymmetric to allocate one part of
data in css_alloc(), while another data in css_free(). This breaks cgroup design,
which specially introduces this two function to differ allocation and onlining.
Also, I've just move the allocation to alloc_mem_cgroup_per_node_info() like it was
suggested in comments to v1...

>>  
>>>> +			if (id == 1)
>>>> +				memcg = NULL;
>>>> +			ret = memcg_expand_maps(memcg, node, size, old_size);
>>>> +			if (ret)
>>>> +				goto unlock;
>>>> +		}
>>>> +
>>>> +		/* root_mem_cgroup is not initialized yet */
>>>> +		if (id == 0)
>>>> +			ret = memcg_expand_maps(NULL, node, size, old_size);
>>>> +	}
>>>> +unlock:
>>>> +	up_write(&shrinkers_max_nr_rwsem);
>>>> +	return ret;
>>>> +}

Kirill

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ