Message-ID: <871serfk77.fsf@xmission.com>
Date: Fri, 04 May 2018 11:40:28 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@...hat.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Kirill Tkhai <ktkhai@...tuozzo.com>, akpm@...ux-foundation.org,
peterz@...radead.org, viro@...iv.linux.org.uk, mingo@...nel.org,
paulmck@...ux.vnet.ibm.com, keescook@...omium.org, riel@...hat.com,
tglx@...utronix.de, kirill.shutemov@...ux.intel.com,
marcos.souza.org@...il.com, hoeun.ryu@...il.com,
pasha.tatashin@...cle.com, gs051095@...il.com, dhowells@...hat.com,
rppt@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg

Oleg Nesterov <oleg@...hat.com> writes:

> On 05/04, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > I'd vote for the change in exec_mmap(). This way mm_init_memcg() can just
>> > nullify mm->memcg.
>>
>> There is at least one common path where we need the memory control group
>> properly initialized so memory allocations don't escape the memory
>> control group.
>>
>> do_execveat_common
>>   copy_strings
>>     get_arg_page
>>       get_user_pages_remote
>>         __get_user_pages_locked
>>           __get_user_pages
>>             faultin_page
>>               handle_mm_fault
>>                 count_memcg_event_mm
>>                 __handle_mm_fault
>>                   handle_pte_fault
>>                     do_anonymous_page
>>                       mem_cgroup_try_charge
>>
>> I am surprised I can't easily find more. Apparently in load_elf_binary
>> we call elf_mmap after set_new_exec and install_exec_creds, making
>> a graceful recovery from elf_mmap failures impossible.
>>
>> In any case we most definitely need the memory control group properly
>> setup before exec_mmap.
>
> Confused ...
>
> new_mm->memcg has no effect until exec_mmap(), why it can't be NULL ?

new_mm->memcg does have an effect before exec_mmap.

> and why do you think mem_cgroup_try_charge() can use the wrong memcg
> in this case?

It would only use the wrong memcg if you had bprm->mm->memcg == NULL.
mm_init_memcg is at the same point as mm_init_owner, so my change did
not alter when the memory control group becomes valid.

get_user_pages_remote is passed bprm->mm, so all of the above happens
on bprm->mm. bprm->mm->memcg is then charged, and the charge fails if
the memory control group is full.
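
To make the point concrete, here is a user-space toy model, not kernel
code. Every name in it (toy_memcg, toy_mm, toy_try_charge, toy_fault_in)
is hypothetical and only stands in for the idea behind
mem_cgroup_try_charge being reached through get_user_pages_remote on
bprm->mm. It assumes a NULL memcg pointer simply means "not accounted to
this group"; what the kernel really does with a NULL pointer is not
modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not kernel types. */
struct toy_memcg {
	long usage;	/* pages currently charged to the group */
	long limit;	/* maximum pages the group may hold */
};

struct toy_mm {
	struct toy_memcg *memcg;	/* NULL models an uninitialized mm->memcg */
};

/*
 * Charge 'pages' to the mm's group.
 * Returns 0 on success and -1 if the group is full.
 * Assumption: with a NULL memcg nothing is accounted, so the
 * allocation escapes the group and always succeeds.
 */
static int toy_try_charge(struct toy_mm *mm, long pages)
{
	struct toy_memcg *cg = mm->memcg;

	assert(pages >= 0);
	if (!cg)
		return 0;
	if (cg->usage + pages > cg->limit)
		return -1;
	cg->usage += pages;
	return 0;
}

/*
 * Fault in 'npages' one page at a time, the way argument copying
 * populates the new mm before exec_mmap. Stops at the first failed
 * charge and returns the number of pages that were faulted in.
 */
static long toy_fault_in(struct toy_mm *mm, long npages)
{
	long done;

	for (done = 0; done < npages; done++) {
		if (toy_try_charge(mm, 1) != 0)
			break;
	}
	return done;
}
```

With a group limited to 4 pages, toy_fault_in on an mm that points at
the group stops after 4 pages, while the same call on an mm whose memcg
is NULL faults in every page and leaves the group's usage untouched.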

It would actually be pointless to allocate and initialize bprm->mm as
early as we do if we were not using it before exec_mmap. Using bprm->mm
implies there will be charges to bprm->mm->memcg.

If bprm->mm were not used until exec_mmap, we could just delay the
allocation until then and avoid these kinds of challenges.

Eric