lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180601140701.GF15278@dhcp22.suse.cz>
Date:   Fri, 1 Jun 2018 16:07:01 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>, peterz@...radead.org,
        viro@...iv.linux.org.uk, mingo@...nel.org,
        paulmck@...ux.vnet.ibm.com, keescook@...omium.org, riel@...hat.com,
        tglx@...utronix.de, kirill.shutemov@...ux.intel.com,
        marcos.souza.org@...il.com, hoeun.ryu@...il.com,
        pasha.tatashin@...cle.com, gs051095@...il.com, dhowells@...hat.com,
        rppt@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
        Balbir Singh <balbir@...ux.vnet.ibm.com>,
        Tejun Heo <tj@...nel.org>, Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH 0/2] mm->owner to mm->memcg fixes

On Thu 31-05-18 13:41:38, Eric W. Biederman wrote:
> Michal Hocko <mhocko@...nel.org> writes:
> 
> > On Thu 24-05-18 14:16:35, Andrew Morton wrote:
> >> On Thu, 24 May 2018 13:10:02 +0200 Michal Hocko <mhocko@...nel.org> wrote:
> >> 
> >> > I would really prefer and appreciate a repost with all the fixes folded
> >> > in.
> >> 
> >> [1/2]
> >
> > Thanks Andrew and sorry it took so long! This seems to be missing the
> > fix for the issue I've mentioned in http://lkml.kernel.org/r/20180510121838.GE5325@dhcp22.suse.cz
> > and Eric wanted to handle by http://lkml.kernel.org/r/87wovu889o.fsf@xmission.com.
> > I do not think that this fix is really correct one though. I will
> > comment in a reply to that email.
> 
> Agreed.  That was not a correct fix to the possible infinite loop in
> get_mem_cgroup_from_mm.   Which in net leaves all of this broken and
> not ready to be merged.
> 
> > In any case I really think this should better be reposted in one patch
> > series and restart the discussion. I strongly suggest revisiting my
> > previous attempt http://lkml.kernel.org/r/1436358472-29137-8-git-send-email-mhocko@kernel.org
> > including the follow up discussion regarding the unfortunate CLONE_VM
> > outside of thread group case and potential solution for that.
> 
> With my fix for get_mem_cgroup_from_mm falling flat that limits our
> possible future directions:
> - Make memory cgroups honest and require all tasks of an mm to be in the
>   same memory cgroup.  [BEST]

Agreed. This should be possible with the multi-pid migration support.
Just do not allow to migrate if the set of pids doesn't contain all
processes sharing the same mm. We can add a special MMF_ flag for mm
that is generated by CLONE_VM without CLONE_THREAD to make the normal
path fast.

Or we can simply fail the migration for this odd cases and implement it
only if somebody actually complains. The flag would be useful for other
cases (e.g. in the oom path where we have to handle these tasks as well).

[...]
> >> -	/* We move charges only when we move a owner of the mm */
> >> -	if (mm->owner == p) {
> c>> +
> >> +	/* We move charges except for creative uses of CLONE_VM */
> >> +	if (mm->memcg == from) {
> >
> > I must be missing something here but how do you prevent those cases?
> > AFAICS processes sharing the mm will simply allow to migrate.
> 
> The mm will only migrate once per set of tasks.  A task that does not
> migrate with the mm will not have mm->memcg == from.  Which is all I was
> referring to.  Perhaps I did not say that well.

OK, I see now (I guess). I would probably just drop the comment. The
code is much more clear when dealing with the memcg than a task...

> >>  		VM_BUG_ON(mc.from);
> >>  		VM_BUG_ON(mc.to);
> >>  		VM_BUG_ON(mc.precharge);
> >
> > Other than that the patch makes sense to me.
> 
> I will take a bit and respin this.  Because of algorithmic bug-fix
> nature of where this started I want to avoid depending on fixing the
> semantics.  With my third option for fixing get_mem_cgroup_from_mm I see
> how to do that now.  Then I will include a separate patch to fix the
> semantics.  

Thanks!
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ