Message-ID: <alpine.DEB.2.10.1706211325340.101895@chino.kir.corp.google.com>
Date: Wed, 21 Jun 2017 13:31:03 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
cc: mhocko@...nel.org, akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm,oom_kill: Close race window of needlessly selecting
new victims.

On Wed, 21 Jun 2017, Tetsuo Handa wrote:
> Umm... So, you are pointing out that having select_bad_process() abort based
> on TIF_MEMDIE or MMF_OOM_SKIP is broken, because victim threads can be removed
> from the global task list or from a cgroup's task list. Then the OOM killer
> would have to wait until every mm_struct in the affected OOM domain (system
> wide or some cgroup) has been reaped by the OOM reaper. The simplest way is to
> wait until all mm_structs have been reaped by the OOM reaper, since we are not
> currently tracking which memory cgroup each mm_struct belongs to, are we? But
> that can cause needless delay when multiple OOM events occur in different OOM
> domains. Do we want to (and can we) make it possible to tell whether each
> mm_struct queued on the OOM reaper's list belongs to the same OOM domain as
> the thread calling out_of_memory()?
>
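
As a toy illustration of that check (a minimal userspace model, not kernel
code: the task list, select_bad_process() and the MMF_OOM_SKIP/TIF_MEMDIE
state are only mimicked by name):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a task on the scanned list. */
struct task {
	const char *comm;
	bool oom_victim;    /* mimics tsk_is_oom_victim() / TIF_MEMDIE */
	bool mmf_oom_skip;  /* mimics MMF_OOM_SKIP on the victim's mm */
};

/*
 * Mimics the abort in select_bad_process(): return NULL (abort) while an
 * unreaped victim is still visible on the list, otherwise pick the first
 * remaining candidate.
 */
static struct task *select_bad_process(struct task *list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (list[i].oom_victim && !list[i].mmf_oom_skip)
			return NULL;
	for (size_t i = 0; i < n; i++)
		if (!list[i].oom_victim)
			return &list[i];
	return NULL;
}

int main(void)
{
	struct task all[] = { { "victim", true, false }, { "other", false, false } };
	struct task *chosen;

	/* While the victim is still on the list, the scan aborts. */
	chosen = select_bad_process(all, 2);
	printf("full list: %s\n", chosen ? chosen->comm : "abort");

	/*
	 * If the victim is unhashed before its mm is reaped, the guard above
	 * never fires and a new victim is chosen even though the victim's
	 * memory is still mapped.
	 */
	chosen = select_bad_process(all + 1, 1);
	printf("victim unhashed: %s\n", chosen ? chosen->comm : "abort");
	return 0;
}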

I am saying that, with your patch, taking an mmget() reference in
mark_oom_victim() and only dropping it via mmput_async() once the oom reaper
can grab mm->mmap_sem (which the exit path itself takes) or the oom reaper
happens to get scheduled causes __mmput() to be called much later: the process
is removed from the tasklist and cgroup_exit() is called before its memory can
actually be unmapped. As a result, subsequent invocations of the oom killer
kill everything before the original victim's mm can undergo __mmput(), because
the oom reaper still holds the reference.
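
To make the ordering concrete, here is a minimal userspace model of that
sequence (mm_users, __mmput(), exit_mm(), the tasklist and the reaper's
mmput_async() are only mimicked; this is not kernel code):

#include <stdbool.h>
#include <stdio.h>

/* Toy mm: the __mmput() analogue runs only when the last user goes away. */
struct mm {
	int mm_users;
	bool unmapped;
};

static void mmput_model(struct mm *mm)
{
	if (--mm->mm_users == 0) {
		mm->unmapped = true;	/* __mmput() analogue: memory finally freed */
		printf("__mmput(): memory unmapped\n");
	}
}

int main(void)
{
	struct mm mm = { .mm_users = 1, .unmapped = false };
	bool on_tasklist = true;

	/* mark_oom_victim() takes the extra reference (the mmget() in the patch). */
	mm.mm_users++;

	/*
	 * Exit path: exit_mm() drops the task's own reference, then the task
	 * is unhashed from the tasklist and cgroup_exit() runs.  The pin keeps
	 * mm_users from hitting zero, so no memory is freed here.
	 */
	mmput_model(&mm);
	on_tasklist = false;

	/* A second oom invocation at this point: the victim is invisible, but
	 * its memory is still fully mapped, so new victims get selected. */
	printf("victim visible: %s, memory unmapped: %s\n",
	       on_tasklist ? "yes" : "no", mm.unmapped ? "yes" : "no");

	/*
	 * Much later, the oom reaper finally gets mm->mmap_sem (or gets
	 * scheduled at all) and drops the pin via the mmput_async() analogue;
	 * only now does the __mmput() analogue run.
	 */
	mmput_model(&mm);
	return 0;
}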