lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <830e05f4-5105-be40-6414-9a90b610d5cc@virtuozzo.com>
Date:   Tue, 27 Mar 2018 18:17:31 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Vladimir Davydov <vdavydov.dev@...il.com>
Cc:     viro@...iv.linux.org.uk, hannes@...xchg.org, mhocko@...nel.org,
        akpm@...ux-foundation.org, tglx@...utronix.de,
        pombredanne@...b.com, stummala@...eaurora.org,
        gregkh@...uxfoundation.org, sfr@...b.auug.org.au, guro@...com,
        mka@...omium.org, penguin-kernel@...ove.SAKURA.ne.jp,
        chris@...is-wilson.co.uk, longman@...hat.com, minchan@...nel.org,
        hillf.zj@...baba-inc.com, ying.huang@...el.com,
        mgorman@...hsingularity.net, shakeelb@...gle.com, jbacik@...com,
        linux@...ck-us.net, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, willy@...radead.org
Subject: Re: [PATCH 03/10] mm: Assign memcg-aware shrinkers bitmap to memcg

On 27.03.2018 13:00, Vladimir Davydov wrote:
> On Mon, Mar 26, 2018 at 06:29:05PM +0300, Kirill Tkhai wrote:
>>>> @@ -182,6 +187,9 @@ struct mem_cgroup {
>>>>  	unsigned long low;
>>>>  	unsigned long high;
>>>>  
>>>> +	/* Bitmap of shrinker ids suitable to call for this memcg */
>>>> +	struct shrinkers_map __rcu *shrinkers_map;
>>>> +
>>>
>>> We keep all per-node data in mem_cgroup_per_node struct. I think this
>>> bitmap should be defined there as well.
>>
>> But them we'll have to have struct rcu_head for every node to free the map
>> via rcu. This is the only reason I did that. But if you think it's not a problem,
>> I'll agree with you.
> 
> I think it's OK. It'd be consistent with how list_lru handles
> list_lru_memcg reallocations.
> 
>>>> @@ -4487,6 +4490,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
>>>>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>>>>  	struct mem_cgroup_event *event, *tmp;
>>>>  
>>>> +	free_shrinker_maps(memcg);
>>>> +
>>>
>>> AFAIU this can race with shrink_slab accessing the map, resulting in
>>> use-after-free. IMO it would be safer to free the bitmap from css_free.
>>
>> But doesn't shrink_slab() iterate only online memcg?
> 
> Well, yes, shrink_slab() bails out if the memcg is offline, but I
> suspect there might be a race condition between shrink_slab and
> css_offline when shrink_slab calls shrinkers for an offline cgroup.
> 
>>
>>>>  	/*
>>>>  	 * Unregister events and notify userspace.
>>>>  	 * Notify userspace about cgroup removing only after rmdir of cgroup
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index 97ce4f342fab..9d1df5d90eca 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -165,6 +165,10 @@ static DECLARE_RWSEM(bitmap_rwsem);
>>>>  static int bitmap_id_start;
>>>>  static int bitmap_nr_ids;
>>>>  static struct shrinker **mcg_shrinkers;
>>>> +struct shrinkers_map *__rcu root_shrinkers_map;
>>>
>>> Why do you need root_shrinkers_map? AFAIR the root memory cgroup doesn't
>>> have kernel memory accounting enabled.
>> But we can charge the corresponding lru and iterate it over global reclaim,
>> don't we?
> 
> Yes, I guess you're right. But do we need to care about it? Would it be
> OK if we iterated over all shrinkers for the root cgroup? Dunno...

In case of 2000 shrinkers, this will flush the cache. This is the reason :)

> Anyway, please try to handle the root cgroup consistently with other
> cgroups. I mean, nothing like this root_shrinkers_map should exist.
> It should be either a part of root_mem_cgroup or we should iterate over
> all shrinkers for the root cgroup.

It's not possible. root_mem_cgroup does not exist always. Even if CONFIG_MEMCG
is enabled, memcg may be prohibited by boot params. 

In case of it's not prohibited, there are some shrinkers, which are registered
before it's initialized, while memory_cgrp_subsys can't has .early_init = 1.

>>
>> struct list_lru_node {
>> 	...
>>         /* global list, used for the root cgroup in cgroup aware lrus */
>>         struct list_lru_one     lru;
>> 	...
>> };

Kirill

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ