linux-kernel - Re: [RFC PATCH] mm, oom: introduce oom reaper

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20151126173100.GA22922@dhcp22.suse.cz>
Date:	Thu, 26 Nov 2015 18:31:00 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	hannes@...xchg.org, linux-mm@...ck.org, akpm@...ux-foundation.org,
	torvalds@...ux-foundation.org, mgorman@...e.de,
	rientjes@...gle.com, riel@...hat.com, hughd@...gle.com,
	oleg@...hat.com, andrea@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm, oom: introduce oom reaper

On Thu 26-11-15 17:34:56, Michal Hocko wrote:
> On Fri 27-11-15 00:24:43, Tetsuo Handa wrote:
> > Michal Hocko wrote:
[...]
> > > +	tlb_gather_mmu(&tlb, mm, 0, -1);
> > > +	for (vma = mm->mmap ; vma; vma = vma->vm_next) {
> > > +		if (is_vm_hugetlb_page(vma))
> > > +			continue;
> > > +
> > > +		/*
> > > +		 * Only anonymous pages have a good chance to be dropped
> > > +		 * without additional steps which we cannot afford as we
> > > +		 * are OOM already.
> > > +		 */
> > > +		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
> > > +			unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
> > > +					 &details);
> > 
> > How do you plan to make sure that reclaimed pages are used by
> > fatal_signal_pending() tasks?
> > http://lkml.kernel.org/r/201509242050.EHE95837.FVFOOtMQHLJOFS@I-love.SAKURA.ne.jp
> > http://lkml.kernel.org/r/201510121543.EJF21858.LtJFHOOOSQVMFF@I-love.SAKURA.ne.jp
> 
> Well the wake_oom_reaper is responsible to hand over mm of the OOM
> victim and as such it should be a killed process.  I guess you mean that
> the mm might be shared with another process which is hidden from the OOM
> killer, right? Well I think this is not something to care about at this
> layer. We shouldn't select a tasks which can lead to this situation in
> the first place. Such an oom victim is basically selected incorrectly. I
> think we can handle that by a flag in mm_struct.
> 
> I guess we have never cared too much about this case because it sounds
> like a misconfiguration. If you want to shoot your own head the do it.
> It is true that this patch will make such a case more prominent because
> we can cause a side effect now. I can cook up a patch to help to sort
> this out.
> 
> Thanks for pointing this out.

OK, so I was thinking about that some more and came to the conclusion
that we cannot use per mm struct flag. This would be basically
equivalent to moving oom_score_adj to mm_struct which has shown to be a
problem in the past (especially for vfork(); set_oom_score; execve()
loads). So I've ended up with the following. It would mean we will not
use the reaper in some cases but those should be marginal and some of
them even dubious at best.
---
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 043c0fe146a5..bceeebe96a1b 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -646,6 +646,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	unsigned int victim_points = 0;
 	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
 					      DEFAULT_RATELIMIT_BURST);
+	bool can_oom_reap = true;
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
@@ -736,15 +737,22 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 			continue;
 		if (same_thread_group(p, victim))
 			continue;
-		if (unlikely(p->flags & PF_KTHREAD))
-			continue;
-		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
+		if (unlikely(p->flags & PF_KTHREAD) ||
+		    p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
+			/*
+			 * We cannot usee oom_reaper for the mm shared by this process
+			 * because it wouldn't get killed and so the memory might be
+			 * still used.
+			 */
+			can_oom_reap = false;
 			continue;
+		}
 
 		do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
 	}
 	rcu_read_unlock();
-	wake_oom_reaper(mm);
+	if (can_oom_reap)
+		wake_oom_reaper(mm);
 	mmdrop(mm);
 	put_task_struct(victim);
 }
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/