Message-ID: <8a81c801-35c8-767d-54b0-df9f1ca0abc0@virtuozzo.com>
Date:   Fri, 13 Apr 2018 14:49:32 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org,
        vdavydov.dev@...il.com, cgroups@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] memcg: Remove memcg_cgroup::id from IDR on
 mem_cgroup_css_alloc() failure

On 13.04.2018 14:38, Michal Hocko wrote:
> On Fri 13-04-18 14:29:11, Kirill Tkhai wrote:
>> On 13.04.2018 14:20, Michal Hocko wrote:
>>> On Fri 13-04-18 14:06:40, Kirill Tkhai wrote:
>>>> On 13.04.2018 14:02, Michal Hocko wrote:
>>>>> On Fri 13-04-18 12:35:22, Kirill Tkhai wrote:
>>>>>> On 13.04.2018 11:55, Michal Hocko wrote:
>>>>>>> On Thu 12-04-18 17:52:04, Kirill Tkhai wrote:
>>>>>>> [...]
>>>>>>>> @@ -4471,6 +4477,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>>>>>>>>  
>>>>>>>>  	return &memcg->css;
>>>>>>>>  fail:
>>>>>>>> +	mem_cgroup_id_remove(memcg);
>>>>>>>>  	mem_cgroup_free(memcg);
>>>>>>>>  	return ERR_PTR(-ENOMEM);
>>>>>>>>  }
>>>>>>>
>>>>>>> The only path which jumps to fail: here (in the current mmotm tree) is 
>>>>>>> 	error = memcg_online_kmem(memcg);
>>>>>>> 	if (error)
>>>>>>> 		goto fail;
>>>>>>>
>>>>>>> AFAICS, and the only failure path in memcg_online_kmem is
>>>>>>> 	memcg_id = memcg_alloc_cache_id();
>>>>>>> 	if (memcg_id < 0)
>>>>>>> 		return memcg_id;
>>>>>>>
>>>>>>> I am not entirely clear on memcg_alloc_cache_id but it seems we do clean
>>>>>>> up properly. Or am I missing something?
>>>>>>
>>>>>> memcg_alloc_cache_id() may allocate a lot of memory when the system has reached
>>>>>> memcg_nr_cache_ids cgroups. In that case it iterates over all LRU lists and doubles
>>>>>> the size of each of them. Under memory pressure this can fail. If it does,
>>>>>> mem_cgroup::id is not unhashed from the IDR and we leak this id.
>>>>>
>>>>> OK, my bad, I was looking at the wrong code path. So you want to clean up
>>>>> after mem_cgroup_alloc not memcg_online_kmem. Now it makes much more
>>>>> sense. Sorry for the confusion on my end.
>>>>>
>>>>> Anyway, shouldn't we do the thing in mem_cgroup_free() to be symmetric
>>>>> to mem_cgroup_alloc?
>>>>
>>>> We can't, since it's called from mem_cgroup_css_free(), which doesn't deal
>>>> with IDR freeing. All the asymmetry we see comes from the trick of unhashing the ID
>>>> earlier than mem_cgroup_css_free() is called.
>>>
>>> Are you sure? It's been some time since I've looked at the quite complex
>>> cgroup tear down code but from what I remember, css_free is called on
>>> the css release (aka when the reference count drops to zero). mem_cgroup_id_put_many
>>> seems to unpin the css reference so we should have idr_remove by the
>>> time when css_free is called. Or am I still wrong and should go over the
>>> brain hurting cgroup removal code again?
>>
>> mem_cgroup_id_put_many() unpins the css, but this may not be the last reference to the css.
>> Thus, we release the ID before all references to the css have been dropped.
> 
> Right and so what. If we have released the idr then we are not going to
> do that again in css_free. That is why we have that memcg->id.id > 0
> check before idr_remove and memcg->id.id = 0 for the last memcg ref.
> count. So again, why cannot we do the clean up in mem_cgroup_free and
> have a less confusing code? Or am I just not getting your point and
> being dense here?

We can, but mem_cgroup_free() being called from mem_cgroup_css_alloc() is the unlikely case.
The likely case is that mem_cgroup_free() is called from mem_cgroup_css_free(), where
these IDR manipulations are a no-op. A no-op in the likely case looks more confusing
to me.
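
To make the "no-op in the likely case" point concrete, here is a minimal
userspace sketch, not kernel code. Every model_* name and MODEL_ID_MAX is
hypothetical; the boolean array only imitates an IDR, and the guard imitates
the memcg->id.id > 0 check mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical small ID space, standing in for MEM_CGROUP_ID_MAX. */
#define MODEL_ID_MAX 8

/* Toy stand-in for mem_cgroup_idr: slot i is true while ID i is taken. */
static bool model_idr[MODEL_ID_MAX + 1];

struct model_memcg {
	int id;	/* 0 means "no ID held" */
};

/* Hand out the lowest free ID in [1, MODEL_ID_MAX], or -1 if none is left. */
static int model_id_alloc(struct model_memcg *memcg)
{
	for (int i = 1; i <= MODEL_ID_MAX; i++) {
		if (!model_idr[i]) {
			model_idr[i] = true;
			memcg->id = i;
			return i;
		}
	}
	return -1;
}

/*
 * Guarded removal: releases the ID only if one is still held, then
 * clears it, so a second call on the same object does nothing.
 */
static void model_id_remove(struct model_memcg *memcg)
{
	if (memcg->id > 0) {
		assert(model_idr[memcg->id]);
		model_idr[memcg->id] = false;
		memcg->id = 0;
	}
}

/* Count the IDs currently allocated; a leak shows up as a nonzero count. */
static int model_ids_in_use(void)
{
	int n = 0;

	for (int i = 1; i <= MODEL_ID_MAX; i++)
		n += model_idr[i];
	return n;
}
```

In this model, calling model_id_remove() from a failed-allocation path
releases the ID, while calling it again from a later free path (after the
ID was already dropped) changes nothing. That second call is the no-op
I find confusing when it sits on the common path.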

It would be less confusing to move

        memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
                                 1, MEM_CGROUP_ID_MAX,
                                 GFP_KERNEL);

into mem_cgroup_css_alloc(). What do you think about this?

Kirill
