Date:   Fri, 04 May 2018 09:36:11 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>, akpm@...ux-foundation.org,
        peterz@...radead.org, viro@...iv.linux.org.uk, mingo@...nel.org,
        paulmck@...ux.vnet.ibm.com, keescook@...omium.org, riel@...hat.com,
        tglx@...utronix.de, kirill.shutemov@...ux.intel.com,
        marcos.souza.org@...il.com, hoeun.ryu@...il.com,
        pasha.tatashin@...cle.com, gs051095@...il.com, dhowells@...hat.com,
        rppt@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
        Balbir Singh <balbir@...ux.vnet.ibm.com>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg

Oleg Nesterov <oleg@...hat.com> writes:

> On 05/03, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > On 05/02, Eric W. Biederman wrote:
>> >>
>> >> +static void mem_cgroup_fork(struct task_struct *tsk)
>> >> +{
>> >> +	struct cgroup_subsys_state *css;
>> >> +
>> >> +	rcu_read_lock();
>> >> +	css = task_css(tsk, memory_cgrp_id);
>> >> +	if (css && css_tryget(css))
>> >> +		task_update_memcg(tsk, mem_cgroup_from_css(css));
>> >> +	rcu_read_unlock();
>> >> +}
>> >
>> > Why do we need it?
>> >
>> > The child's mm->memcg was already initialized by mm_init_memcg() and we can't
>> > race with migrate until cgroup_threadgroup_change_end() ?
>>
>> I admit I missed the cgroup_threadgroup_change_begin
>> cgroup_threadgroup_change_end pair in fs fork.  In this case it doesn't
>> matter because mm_init_memcg is called from:
>>
>>    copy_mm
>>       dup_mm
>>         mm_init
>>
>> And copy_mm is called before we call cgroup_threadgroup_change_begin.
>> So the race remains.
>
> Ah yes, you are right.
>
>> We could move cgroup_threadgroup_change_begin earlier, to remove
>> the need for mem_cgroup_fork.  But I have not analyzed that.
>
> No, cgroup_threadgroup_change_begin() was called early and this was wrong, see
> 568ac888215c7fb2fabe8ea739b00ec3c1f5d440. Actually there were more problems, say
> copy_net() could deadlock because cleanup_net() does do_wait() with net_mutex held.
>
>
> OK, what about exec() ? mm_init_memcg() initializes bprm->mm->memcg early in
> bprm_mm_init(). What if the execing task migrates before exec_mmap() ?

We need the cgroup when the mm is initialized.  That way we have the
cgroup information when initializing the mm.

I don't know if a lock preventing changing the cgroup in exec or just
a little bit of code in exec_mmap to ensure mm->memcg is properly set
is the better approach.  I have not analyzed that code path.
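
To make the window concrete, here is a rough userspace model of the
second option (re-syncing mm->memcg in exec_mmap).  It is only a sketch:
the struct layouts, the model_* helpers and the resync flag are made up
for illustration and are not the kernel's real code or API.  It assumes
migration only updates the mm the task currently points at, so the
not-yet-installed bprm->mm is missed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; NOT the kernel's real structures. */
struct memcg { int id; };
struct mm { struct memcg *memcg; };
struct task { struct memcg *memcg; struct mm *mm; };

/* Models mm_init_memcg(): the new mm records the task's cgroup as of now. */
static void model_mm_init_memcg(struct mm *mm, const struct task *tsk)
{
	mm->memcg = tsk->memcg;
}

/*
 * Models a cgroup migration.  Assumption: it only knows about tsk->mm,
 * so an mm that exec has allocated but not yet installed is not updated.
 */
static void model_migrate(struct task *tsk, struct memcg *to)
{
	tsk->memcg = to;
	if (tsk->mm)
		tsk->mm->memcg = to;
}

/*
 * Models exec_mmap().  With resync set, the new mm is re-pointed at the
 * task's current cgroup before it is installed (the hypothetical fix-up).
 */
static void model_exec_mmap(struct task *tsk, struct mm *new_mm, int resync)
{
	assert(new_mm != NULL);
	if (resync)
		new_mm->memcg = tsk->memcg;
	tsk->mm = new_mm;
}
```

Without the resync step the installed mm keeps the cgroup it saw in
bprm_mm_init(); with it, the mm follows the task.  The real code would
also need to hold whatever lock or RCU protection keeps the cgroup from
changing between the re-sync and the install, which this model ignores.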

This does look like a very good place for an incremental patch to close
that race.

Eric
