linux-kernel - Re: [PATCH 3/5] oom: oom-killer don't use proportion of system-ram internally

Message-ID: <alpine.DEB.2.00.1105231547060.17840@chino.kir.corp.google.com>
Date:	Mon, 23 May 2011 15:48:42 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	caiqian@...hat.com
cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>, hughd@...gle.com,
	kamezawa.hiroyu@...fujitsu.com, minchan.kim@...il.com,
	oleg@...hat.com
Subject: Re: [PATCH 3/5] oom: oom-killer don't use proportion of system-ram
 internally

On Mon, 23 May 2011, David Rientjes wrote:

> I already suggested an alternative patch to CAI Qian to greatly increase 
> the granularity of the oom score from a range of 0-1000 to 0-10000 to 
> differentiate between tasks within 0.01% of available memory (16MB on CAI 
> Qian's 16GB system).  I'll propose this officially in a separate email.
> 

This is the alternative patch proposed earlier, with the improvements 
suggested by Minchan.  CAI, would it be possible to test it with your 
use case?

I'm indifferent to the actual value of OOM_SCORE_MAX_FACTOR; it could be 
10 as proposed in this patch, or larger if higher resolution is needed.
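
To make the resolution argument concrete, here is a minimal userspace 
sketch of how much memory one score point represents for a given factor.  
bytes_per_point() is a hypothetical helper written for illustration and 
is not part of the patch; OOM_SCORE_ADJ_MAX is assumed to be 1000, 
matching the oom_score_adj range documented in the patch below.

```c
#include <assert.h>
#include <stdint.h>

#define OOM_SCORE_ADJ_MAX 1000	/* assumed: upper bound of oom_score_adj */

/*
 * Hypothetical helper: bytes of memory covered by a single score point
 * when the score range is 0..(OOM_SCORE_ADJ_MAX * factor).
 */
static uint64_t bytes_per_point(uint64_t total_bytes, unsigned int factor)
{
	assert(factor > 0);
	return total_bytes / ((uint64_t)OOM_SCORE_ADJ_MAX * factor);
}
```

By this arithmetic, with 16GB of RAM one point covers roughly 16MB on 
the current 0-1000 scale (factor 1) and roughly 1.6MB on the proposed 
0-10000 scale (factor 10).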


diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -38,6 +38,9 @@ int sysctl_oom_kill_allocating_task;
 int sysctl_oom_dump_tasks = 1;
 static DEFINE_SPINLOCK(zone_scan_lock);
 
+#define OOM_SCORE_MAX_FACTOR	10
+#define OOM_SCORE_MAX		(OOM_SCORE_ADJ_MAX * OOM_SCORE_MAX_FACTOR)
+
 #ifdef CONFIG_NUMA
 /**
  * has_intersects_mems_allowed() - check task eligiblity for kill
@@ -160,7 +163,7 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
 	 */
 	if (p->flags & PF_OOM_ORIGIN) {
 		task_unlock(p);
-		return 1000;
+		return OOM_SCORE_MAX;
 	}
 
 	/*
@@ -177,32 +180,38 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
 	points = get_mm_rss(p->mm) + p->mm->nr_ptes;
 	points += get_mm_counter(p->mm, MM_SWAPENTS);
 
-	points *= 1000;
+	points *= OOM_SCORE_MAX;
 	points /= totalpages;
 	task_unlock(p);
 
 	/*
-	 * Root processes get 3% bonus, just like the __vm_enough_memory()
-	 * implementation used by LSMs.
+	 * Root processes get a bonus of 1% per 10% of memory used.
 	 */
-	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		points -= 30;
+	if (has_capability_noaudit(p, CAP_SYS_ADMIN)) {
+		int bonus;
+		int granularity;
+
+		bonus = OOM_SCORE_MAX / 100;		/* bonus is 1% */
+		granularity = OOM_SCORE_MAX / 10;	/* granularity is 10% */
+
+		points -= bonus * (points / granularity);
+	}
 
 	/*
 	 * /proc/pid/oom_score_adj ranges from -1000 to +1000 such that it may
 	 * either completely disable oom killing or always prefer a certain
 	 * task.
 	 */
-	points += p->signal->oom_score_adj;
+	points += p->signal->oom_score_adj * OOM_SCORE_MAX_FACTOR;
 
 	/*
 	 * Never return 0 for an eligible task that may be killed since it's
-	 * possible that no single user task uses more than 0.1% of memory and
+	 * possible that no single user task uses more than 0.01% of memory and
 	 * no single admin tasks uses more than 3.0%.
 	 */
 	if (points <= 0)
 		return 1;
-	return (points < 1000) ? points : 1000;
+	return (points < OOM_SCORE_MAX) ? points : OOM_SCORE_MAX;
 }
 
 /*
@@ -314,7 +323,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			 */
 			if (p == current) {
 				chosen = p;
-				*ppoints = 1000;
+				*ppoints = OOM_SCORE_MAX;
 			} else {
 				/*
 				 * If this task is not being ptraced on exit,
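
For anyone who wants to check the scoring arithmetic without building a 
kernel, the following is a rough userspace model of the patched 
oom_badness() calculation.  It is only a sketch: badness_model() and its 
parameters are hypothetical names, used_pages stands in for rss + page 
tables + swap entries, is_admin stands in for the CAP_SYS_ADMIN check, 
and the early exits of the real function (unkillable tasks, 
PF_OOM_ORIGIN) are left out.

```c
#include <assert.h>

#define OOM_SCORE_ADJ_MAX	1000	/* assumed: upper bound of oom_score_adj */
#define OOM_SCORE_MAX_FACTOR	10
#define OOM_SCORE_MAX		(OOM_SCORE_ADJ_MAX * OOM_SCORE_MAX_FACTOR)

/* Hypothetical model of the patched scoring; not kernel code. */
static unsigned int badness_model(unsigned long long used_pages,
				  unsigned long long total_pages,
				  int is_admin, int oom_score_adj)
{
	long long points;

	assert(total_pages > 0);

	/* Scale memory usage to the 0..OOM_SCORE_MAX range. */
	points = (long long)(used_pages * OOM_SCORE_MAX / total_pages);

	/* Admin tasks get a bonus of 1% per full 10% of memory used. */
	if (is_admin) {
		int bonus = OOM_SCORE_MAX / 100;
		int granularity = OOM_SCORE_MAX / 10;

		points -= bonus * (points / granularity);
	}

	/* oom_score_adj is scaled to match the finer score range. */
	points += (long long)oom_score_adj * OOM_SCORE_MAX_FACTOR;

	/* Never return 0 for an eligible task; clamp at the maximum. */
	if (points <= 0)
		return 1;
	return (points < OOM_SCORE_MAX) ? points : OOM_SCORE_MAX;
}
```

In this model a task using half of memory scores 5000, or 4500 if it is 
an admin task.  Because points / granularity is an integer division, an 
admin task using less than 10% of memory gets no bonus at all.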