linux-kernel - Re: [patch 0/7] improve memcg oom killer robustness v2

Message-ID: <20130905132415.GD13666@dhcp22.suse.cz>
Date:	Thu, 5 Sep 2013 15:24:15 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	azurIt <azurit@...ox.sk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-mm@...ck.org, cgroups@...r.kernel.org, x86@...nel.org,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 0/7] improve memcg oom killer robustness v2

On Thu 05-09-13 07:54:30, Johannes Weiner wrote:
[...]
> From: Johannes Weiner <hannes@...xchg.org>
> Subject: [patch] mm: memcg: handle non-error OOM situations more gracefully
> 
> Many places that can trigger a memcg OOM situation return gracefully
> and don't propagate VM_FAULT_OOM up the fault stack.
> 
> It's not practical to annotate all of them to disable the memcg OOM
> killer.  Instead, just clean up any set OOM state without warning in
> case the fault is not returning VM_FAULT_OOM.
> 
> Also fail charges immediately when the current task already is in an
> OOM context.  Otherwise, the previous context gets overwritten and the
> memcg reference is leaked.
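
To make the leak described above concrete, here is a minimal userspace sketch. It is not kernel code: struct memcg_model, struct task_oom_model, charge_model() and oom_synchronize_model() are hypothetical stand-ins, and the assumption is that recording an OOM context pins the memcg with a reference that only the later synchronize step drops.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; only the reference count is modeled. */
struct memcg_model {
	int refcount;
};

/* Hypothetical per-task OOM state: non-NULL memcg means "in OOM context". */
struct task_oom_model {
	struct memcg_model *memcg;
};

/* Record an OOM context, pinning the memcg (assumed behavior). */
static void record_oom_model(struct task_oom_model *t, struct memcg_model *m)
{
	m->refcount++;
	t->memcg = m;	/* silently overwrites any earlier context */
}

/*
 * Model of a charge attempt. Returns 0 on success and -1 (a stand-in
 * for -ENOMEM) on failure. With fail_fast set, a task that already has
 * an OOM context fails at once instead of recording a second one.
 */
static int charge_model(struct task_oom_model *t, struct memcg_model *m,
			bool over_limit, bool fail_fast)
{
	if (fail_fast && t->memcg)
		return -1;
	if (!over_limit)
		return 0;
	record_oom_model(t, m);
	return -1;
}

/* Drop the recorded context and the single reference it still points to. */
static void oom_synchronize_model(struct task_oom_model *t)
{
	if (!t->memcg)
		return;
	t->memcg->refcount--;
	t->memcg = NULL;
}
```

Without fail_fast, two failing charges against different groups leave the first group's count at 1 after oom_synchronize_model(), because the second call overwrote the only pointer to it. With fail_fast, the second charge never records anything and both counts return to 0.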

Could you please paste find_or_create_page called from __getblk as an
example here? That way we do not have to scratch our heads again later.

Also, the task_in_memcg_oom check could be moved into the
mem_cgroup_disable_oom branch to avoid the overhead for in-kernel faults.
That overhead should not be noticeable, though, so I am not sure it is
that important.
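
Roughly, the idea is the following. This is only a userspace sketch with stubbed helpers, not a patch: struct task_model, fault_body_model() and handle_mm_fault_model() are hypothetical, the flag values are placeholders, and it assumes OOM handling is armed with mem_cgroup_enable_oom() before the fault, which the quoted hunk does not show.

```c
#include <stdbool.h>

#define FAULT_FLAG_USER	0x01u	/* placeholder value for this sketch */
#define VM_FAULT_OOM	0x01	/* placeholder value for this sketch */

/* Hypothetical stand-in for the per-task memcg OOM state. */
struct task_model {
	bool oom_enabled;	/* OOM handling armed for this fault */
	bool in_memcg_oom;	/* an OOM context was set up during the fault */
};

static struct task_model current_task;
static int sync_calls;		/* counts cleanups, for illustration only */

static void mem_cgroup_enable_oom(void)  { current_task.oom_enabled = true; }
static void mem_cgroup_disable_oom(void) { current_task.oom_enabled = false; }
static bool task_in_memcg_oom(void)      { return current_task.in_memcg_oom; }

/* Stub: drop a pending OOM context; the 'handle' flag is ignored here. */
static bool mem_cgroup_oom_synchronize(bool handle)
{
	(void)handle;
	if (!current_task.in_memcg_oom)
		return false;
	current_task.in_memcg_oom = false;
	sync_calls++;
	return true;
}

/* Stub fault body: OOM state is recorded only while handling is armed. */
static int fault_body_model(bool hit_limit, int ret)
{
	if (hit_limit && current_task.oom_enabled)
		current_task.in_memcg_oom = true;
	return ret;
}

static int handle_mm_fault_model(unsigned int flags, bool hit_limit,
				 int fault_ret)
{
	int ret;

	if (flags & FAULT_FLAG_USER)
		mem_cgroup_enable_oom();

	ret = fault_body_model(hit_limit, fault_ret);

	if (flags & FAULT_FLAG_USER) {
		mem_cgroup_disable_oom();
		/* Only user faults can have set up OOM state, so check here. */
		if (task_in_memcg_oom() && !(ret & VM_FAULT_OOM))
			mem_cgroup_oom_synchronize(false);
	}

	return ret;
}
```

In this model an in-kernel fault (flags == 0) skips the task_in_memcg_oom() test entirely, while a user fault that set up OOM state but did not return VM_FAULT_OOM still gets cleaned up.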

> Signed-off-by: Johannes Weiner <hannes@...xchg.org>

I do not see an easier way to fix this without going back to the old
behavior, which is much worse.

Acked-by: Michal Hocko <mhocko@...e.cz>

Thanks!

> diff --git a/mm/memory.c b/mm/memory.c
> index cdbe41b..cdad471 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -57,7 +57,6 @@
>  #include <linux/swapops.h>
>  #include <linux/elf.h>
>  #include <linux/gfp.h>
> -#include <linux/stacktrace.h>
>  
>  #include <asm/io.h>
>  #include <asm/pgalloc.h>
> @@ -3521,11 +3520,8 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	if (flags & FAULT_FLAG_USER)
>  		mem_cgroup_disable_oom();
>  
> -	if (WARN_ON(task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))) {
> -		printk("Fixing unhandled memcg OOM context set up from:\n");
> -		print_stack_trace(&current->memcg_oom.trace, 0);
> -		mem_cgroup_oom_synchronize();
> -	}
> +	if (task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))
> +		mem_cgroup_oom_synchronize(false);
>  
>  	return ret;
>  }
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index aa60863..3bf664c 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -785,7 +785,7 @@ out:
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	if (mem_cgroup_oom_synchronize())
> +	if (mem_cgroup_oom_synchronize(true))
>  		return;
>  	if (try_set_system_oom()) {
>  		out_of_memory(NULL, 0, 0, NULL);
> -- 
> 1.8.4
> 

-- 
Michal Hocko
SUSE Labs