Message-ID: <20180504142056.GA26151@redhat.com>
Date: Fri, 4 May 2018 16:20:57 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Kirill Tkhai <ktkhai@...tuozzo.com>, akpm@...ux-foundation.org,
peterz@...radead.org, viro@...iv.linux.org.uk, mingo@...nel.org,
paulmck@...ux.vnet.ibm.com, keescook@...omium.org, riel@...hat.com,
tglx@...utronix.de, kirill.shutemov@...ux.intel.com,
marcos.souza.org@...il.com, hoeun.ryu@...il.com,
pasha.tatashin@...cle.com, gs051095@...il.com, dhowells@...hat.com,
rppt@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg

On 05/03, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@...hat.com> writes:
>
> > On 05/02, Eric W. Biederman wrote:
> >>
> >> +static void mem_cgroup_fork(struct task_struct *tsk)
> >> +{
> >> +	struct cgroup_subsys_state *css;
> >> +
> >> +	rcu_read_lock();
> >> +	css = task_css(tsk, memory_cgrp_id);
> >> +	if (css && css_tryget(css))
> >> +		task_update_memcg(tsk, mem_cgroup_from_css(css));
> >> +	rcu_read_unlock();
> >> +}
> >
> > Why do we need it?
> >
> > The child's mm->memcg was already initialized by mm_init_memcg() and we can't
> > race with migrate until cgroup_threadgroup_change_end() ?
>
> I admit I missed the cgroup_threadgroup_change_begin
> cgroup_threadgroup_change_end pair in fs fork. In this case it doesn't
> matter because mm_init_memcg is called from:
>
> 	copy_mm
> 	  dup_mm
> 	    mm_init
>
> And copy_mm is called before we call cgroup_threadgroup_change_begin.
> So the race remains.

Ah yes, you are right.

> We could move cgroup_threadgroup_change_begin earlier, to remove
> the need for mem_cgroup_fork. But I have not analyzed that.

No, cgroup_threadgroup_change_begin() was called early and this was wrong, see
568ac888215c7fb2fabe8ea739b00ec3c1f5d440. Actually there were more problems, say
copy_net() could deadlock because cleanup_net() does do_wait() with net_mutex held.

OK, what about exec() ? mm_init_memcg() initializes bprm->mm->memcg early in
bprm_mm_init(). What if the execing task migrates before exec_mmap() ?
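
To make the ordering in that question concrete, here is a userspace toy model.
It is not kernel code: every struct and every toy_* function is a hypothetical
stand-in that only mimics the name of the kernel function mentioned above. It
also assumes that migration can update only the mm the task currently owns;
whether the patch really behaves this way is exactly the open question.

```c
/* Userspace toy model of the exec() window; NOT kernel code. */
#include <stddef.h>

struct memcg { const char *name; };
struct mm { struct memcg *memcg; };
struct task { struct mm *mm; struct memcg *cgroup; };

/* Stand-in for mm_init_memcg(): snapshot the task's current memcg. */
static void toy_mm_init_memcg(struct mm *mm, struct task *tsk)
{
	mm->memcg = tsk->cgroup;
}

/*
 * Stand-in for migration. Assumption: it can reach only tsk->mm,
 * so an mm that is not installed yet is never updated.
 */
static void toy_migrate(struct task *tsk, struct memcg *to)
{
	tsk->cgroup = to;
	if (tsk->mm)
		tsk->mm->memcg = to;
}

/* Stand-in for exec_mmap(): install the new mm on the task. */
static void toy_exec_mmap(struct task *tsk, struct mm *new_mm)
{
	tsk->mm = new_mm;
}

/* Returns 1 if the new mm's memcg no longer matches the task's cgroup. */
int toy_exec_is_stale(int migrate_in_window)
{
	struct memcg a = { "A" }, b = { "B" };
	struct mm old_mm = { &a }, new_mm = { NULL };
	struct task tsk = { &old_mm, &a };

	toy_mm_init_memcg(&new_mm, &tsk);	/* bprm_mm_init() time */
	if (migrate_in_window)
		toy_migrate(&tsk, &b);		/* before exec_mmap() */
	toy_exec_mmap(&tsk, &new_mm);

	return new_mm.memcg != tsk.cgroup;
}
```

In this model toy_exec_is_stale(1) returns 1 (the new mm still points at "A"
while the task is in "B"), and toy_exec_is_stale(0) returns 0.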

Oleg.