Message-ID: <alpine.DEB.2.02.1312131551220.28704@chino.kir.corp.google.com>
Date:	Fri, 13 Dec 2013 15:55:44 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Michal Hocko <mhocko@...e.cz>
cc:	Johannes Weiner <hannes@...xchg.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	cgroups@...r.kernel.org
Subject: Re: [patch 1/2] mm, memcg: avoid oom notification when current needs
 access to memory reserves

On Thu, 12 Dec 2013, Michal Hocko wrote:

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c72b03bf9679..5cb1deea6aac 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2256,15 +2256,16 @@ bool mem_cgroup_oom_synchronize(bool handle)
>  
>  	locked = mem_cgroup_oom_trylock(memcg);
>  
> -	if (locked)
> -		mem_cgroup_oom_notify(memcg);
> -
>  	if (locked && !memcg->oom_kill_disable) {
>  		mem_cgroup_unmark_under_oom(memcg);
>  		finish_wait(&memcg_oom_waitq, &owait.wait);
> +		/* calls mem_cgroup_oom_notify if there is a task to kill */
>  		mem_cgroup_out_of_memory(memcg, current->memcg_oom.gfp_mask,
>  					 current->memcg_oom.order);
>  	} else {
> +		if (locked && memcg->oom_kill_disable)
> +			mem_cgroup_oom_notify(memcg);
> +
>  		schedule();
>  		mem_cgroup_unmark_under_oom(memcg);
>  		finish_wait(&memcg_oom_waitq, &owait.wait);
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 1e4a600a6163..2a7f15900922 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -470,6 +470,9 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
>  		victim = p;
>  	}
>  
> +	if (memcg)
> +		mem_cgroup_oom_notify(memcg);
> +
>  	/* mm cannot safely be dereferenced after task_unlock(victim) */
>  	mm = victim->mm;
>  	pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",

Yes, I like this.

> The semantic would be as simple as "a notification is sent only when
> an action is due". It will still be racy, as nothing prevents a task
> which is not under OOM from exiting and releasing some memory, but
> there is no sensible way to address that. On the other hand, such a
> semantic would be sensible for oom_control listeners because they will
> know that an action has to be or will be taken (the line has been drawn).
> 

I think this makes absolute sense and is in agreement with what is 
described in Documentation/cgroups/memory.txt.
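For reference, this is roughly what such an oom_control listener looks like
from userspace under the cgroup v1 interface described in memory.txt: it
registers an eventfd against memory.oom_control via cgroup.event_control and
is only woken when the kernel flags an OOM event in the group. A minimal
sketch; the mount point and group name ("test") are made up for the example:

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
	/* Assumes a v1 memory controller mounted at /sys/fs/cgroup/memory
	 * and a group called "test"; adjust for the real setup. */
	int efd = eventfd(0, 0);
	int ofd = open("/sys/fs/cgroup/memory/test/memory.oom_control", O_RDONLY);
	int cfd = open("/sys/fs/cgroup/memory/test/cgroup.event_control", O_WRONLY);
	char buf[64];
	uint64_t cnt;

	if (efd < 0 || ofd < 0 || cfd < 0) {
		perror("setup");
		return 1;
	}

	/* Register the listener: "<eventfd> <fd of memory.oom_control>". */
	snprintf(buf, sizeof(buf), "%d %d", efd, ofd);
	if (write(cfd, buf, strlen(buf)) < 0) {
		perror("register");
		return 1;
	}

	/* Blocks until the kernel signals an OOM event for the group. */
	if (read(efd, &cnt, sizeof(cnt)) == sizeof(cnt))
		printf("memcg OOM notification received (count %llu)\n",
		       (unsigned long long)cnt);
	return 0;
}

With the patch above, a wakeup here means the kernel has actually reached the
point of taking (or, with oom_kill_disable, deferring) an OOM action.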

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c72b03bf9679..fee25c5934d2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2692,7 +2693,8 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
>  	 * MEMDIE process.
>  	 */
>  	if (unlikely(test_thread_flag(TIF_MEMDIE)
> -		     || fatal_signal_pending(current)))
> +		     || fatal_signal_pending(current))
> +		     || current->flags & PF_EXITING)
>  		goto bypass;
>  
>  	if (unlikely(task_in_memcg_oom(current)))
> 
> rather than the later checks down the oom_synchronize paths. The comment
> already mentions dying process...
> 

This is scary because it doesn't even try to reclaim memcg memory before 
allowing the allocation to succeed.  I think we could even argue that the 
fatal_signal_pending(current) check should be moved later, and that the only 
condition we should really bypass here is TIF_MEMDIE, since it only gets set 
when reclaim has already failed.
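Roughly (only a sketch of the idea against the hunk above, not a tested
patch), __mem_cgroup_try_charge would bypass early only for TIF_MEMDIE, and
the dying/exiting cases would bypass only after reclaim has had a chance to
run:

	/*
	 * Sketch only: bypass the charge early just for TIF_MEMDIE, which
	 * is set only after reclaim has already failed for this task.
	 */
	if (unlikely(test_thread_flag(TIF_MEMDIE)))
		goto bypass;

	...

	/*
	 * Only once memcg reclaim has been attempted and failed do we let
	 * a task with a pending fatal signal (or one that is exiting)
	 * bypass the charge as well.
	 */
	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
		goto bypass;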