linux-kernel - Re: Possible race condition in oom-killer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170728132952.GQ2274@dhcp22.suse.cz>
Date:   Fri, 28 Jul 2017 15:29:53 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:     mjaggi@...iumnetworks.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: Possible race condition in oom-killer

On Fri 28-07-17 22:15:01, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > > 4578 is consuming memory as mlocked pages. But the OOM reaper cannot reclaim
> > > mlocked pages (i.e. can_madv_dontneed_vma() returns false due to VM_LOCKED), can it?
> > 
> > You are absolutely right. I am pretty sure I've checked mlocked counter
> > as the first thing but that must be from one of the earlier oom reports.
> > My fault I haven't checked it in the critical one
> > 
> > [  365.267347] oom_reaper: reaped process 4578 (oom02), now anon-rss:131559616kB, file-rss:0kB, shmem-rss:0kB
> > [  365.282658] oom_reaper: reaped process 4583 (oom02), now anon-rss:131561664kB, file-rss:0kB, shmem-rss:0kB
> > 
> > and the above screemed about the fact I was just completely blind.
> > 
> > mlock pages handling is on my todo list for quite some time already but
> > I didn't get around it to implement that. mlock code is very tricky.
> 
> task_will_free_mem(current) in out_of_memory() returning false due to
> MMF_OOM_SKIP already set allowed each thread sharing that mm to select a new
> OOM victim. If task_will_free_mem(current) in out_of_memory() did not return
> false, threads sharing MMF_OOM_SKIP mm would not have selected new victims
> to the level where all OOM killable processes are killed and calls panic().

I am not sure I understand. Do you mean this?
---
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e8b4f030c1c..671e4a4107d0 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -779,13 +779,6 @@ static bool task_will_free_mem(struct task_struct *task)
 	if (!__task_will_free_mem(task))
 		return false;
 
-	/*
-	 * This task has already been drained by the oom reaper so there are
-	 * only small chances it will free some more
-	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		return false;
-
 	if (atomic_read(&mm->mm_users) <= 1)
 		return true;
 
If yes I would have to think about this some more because that might
have weird side effects (e.g. oom_victims counting after threads passed
exit_oom_victim).

Anyway the proper fix for this is to allow reaping mlocked pages. Is
something other than the LTP test affected to give this more priority?
Do we have other usecases where something mlocks the whole memory?
-- 
Michal Hocko
SUSE Labs