lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87vabyvnw0.fsf@xmission.com>
Date:   Mon, 07 May 2018 22:15:27 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>, akpm@...ux-foundation.org,
        peterz@...radead.org, viro@...iv.linux.org.uk, mingo@...nel.org,
        paulmck@...ux.vnet.ibm.com, keescook@...omium.org, riel@...hat.com,
        tglx@...utronix.de, kirill.shutemov@...ux.intel.com,
        marcos.souza.org@...il.com, hoeun.ryu@...il.com,
        pasha.tatashin@...cle.com, gs051095@...il.com, dhowells@...hat.com,
        rppt@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
        Balbir Singh <balbir@...ux.vnet.ibm.com>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg

Oleg Nesterov <oleg@...hat.com> writes:

> On 05/04, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > On 05/04, Eric W. Biederman wrote:
>> >>
>> >> Oleg Nesterov <oleg@...hat.com> writes:
>> >>
>> >> > I'd vote for the change in exec_mmap(). This way mm_init_memcg() can just
>> >> > nullify mm->memcg.
>> >>
>> >> There is at least one common path where we need the memory control group
>> >> properly initialized so memory allocations don't escape the memory
>> >> control group.
>> >>
>> >> do_execveat_common
>> >>    copy_strings
>> >>       get_arg_page
>> >>          get_user_pages_remote
>> >>             __get_user_pages_locked
>> >>                __get_user_pages
>> >>                   faultin_page
>> >>                      handle_mm_fault
>> >>                         count_memcg_event_mm
>> >>                         __handle_mm_fault
>> >>                           handle_pte_fault
>> >>                              do_anonymous_page
>> >>                                 mem_cgroup_try_charge
>
> Ah yes, indeed.
>
>> mm_init_memcg is at the same point as mm_init_owner.  So my change did
>> not introduce any logic changes on when the memory control group became
>> valid.
>
> Not sure, but perhaps I am all confused....

So.  In exec actions that are initiated while a process is calling exec
can either logcially happen either before or after the exec.  That is
the way these races are always resolved.  I don't see any reason
for the memory control group to be any different.

Which means it is perfectly valid for a migration that technically
happens in the middle of copy_strings to not show up until some time
later in exec.

This works because there are no user visible effects until exec
completes.  If there are user visible effects they are bugs.

That is all it takes to understand why my patch fixes things.

It might make more sense to simply fail migration if task->in_execve.
While perhaps the best solution that might introduce a user visible
effect that we don't care about right now.

> before your patch get_mem_cgroup_from_mm() looks at mm->owner == current
> (in this case) and mem_cgroup_from_task() should return the correct memcg
> even if execing task migrates after bprm_mm_init(). At least in the common
> case when the old mm is not shared.
>
> After your patch the memory allocations in copy_strings() won't be accounted
> correctly, bprm->mm->memcg is wrong if this task migrates. And iiuc your recent
> "[PATCH 2/2] memcg: Close the race between migration and installing bprm->mm as mm"
> doesn't fix the problem.
>
> No?

The patch does solve the issue.  There should be nothing a userspace
process can observe that should tell it where in the middle of exec
such a migration happend so placing the migration at what from the
kernel's perspective might be technically later should not be a problem.

If it is a problem the issue is that there is a way to observe the
difference.

> Perhaps we can change get_mem_cgroup_from_mm() to use
> mem_cgroup_from_css(current, memory_cgrp_id) if mm->memcg == NULL?

Please God no.  Having any unnecessary special case is just going to
confuse people and cause bugs.

Plus as I have pointed out above that is not an issue.

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ