Date:	Tue, 10 May 2011 16:22:31 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
cc:	CAI Qian <caiqian@...hat.com>, avagin@...il.com,
	Andrey Vagin <avagin@...nvz.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mel@....ul.ie>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Minchan Kim <minchan.kim@...il.com>,
	Hugh Dickins <hughd@...gle.com>,
	Oleg Nesterov <oleg@...hat.com>
Subject: Re: OOM Killer doesn't work at all if the system has gigabytes of
 memory (was Re: [PATCH] mm: check zone->all_unreclaimable in
 all_unreclaimable())

On Tue, 10 May 2011, KOSAKI Motohiro wrote:

> OK. That's a known issue. The current OOM logic doesn't work if you have
> gigabytes of RAM, because _all_ processes have exactly the same score (=1),
> so the oom killer just falls back to being a random process killer. This
> was introduced by commit a63d83f427 (oom: badness heuristic rewrite). I
> have pointed it out at least three times. You have to blame Google folks. :-/
> 

If all threads have the same badness score, which by definition must be 1 
since that is the lowest badness score possible for an eligible thread, 
then each thread is using < 0.2% of RAM.

The granularity of the badness score means threads whose memory usage 
differs by less than 0.1% of system RAM (in this case, 16MB) cannot be 
differentiated from one another in terms of kill priority.  The largest 
consumers of memory in CAI's log have an rss of 336MB, which is ~2% of 
system RAM.  The problem is that these are forked by root and therefore get 
a 3% bonus, dropping their badness score to 1 instead of ~20.
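
For concreteness, here is a minimal userspace sketch of that arithmetic
(simplified from the heuristic in commit a63d83f427; the real code in
mm/oom_kill.c also counts swap usage and folds in oom_score_adj):

#include <stdio.h>

#define OOM_SCORE_MAX 1000	/* one point per 0.1% of system RAM */

static unsigned int badness(unsigned long long rss_bytes,
			    unsigned long long total_bytes, int is_root)
{
	unsigned long long points = rss_bytes * OOM_SCORE_MAX / total_bytes;

	/* root tasks get a bonus worth 3% of system RAM (30 points) */
	if (is_root)
		points = points > 30 ? points - 30 : 0;

	/* any eligible task scores at least 1 */
	return points ? (unsigned int)points : 1;
}

int main(void)
{
	unsigned long long total = 16ULL << 30;	/* 16GB machine */
	unsigned long long rss = 336ULL << 20;	/* 336MB task from CAI's log */

	printf("non-root: %u\n", badness(rss, total, 0));	/* -> 20 */
	printf("root:     %u\n", badness(rss, total, 1));	/* -> 1  */
	return 0;
}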

 [ You also don't have to blame "Google folks," I rewrote the oom
   killer. ]

> 
> There are three problems.
> 
> 1) If two processes have the same oom score, we should kill the younger
>    process, but the current logic kills the older one. The oldest
>    processes are typically system daemons.

Agreed, it seems advantageous to prefer killing the threads that have done 
the least amount of work (defined as those with the least runtime compared 
to others in tasklist order).
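
Something like the following tiebreak would capture that (a hypothetical
sketch; the struct and its fields are illustrative, not the actual
task_struct layout):

#include <stdio.h>

struct candidate {
	unsigned int points;		/* badness score, 0-1000 */
	unsigned long long runtime_ns;	/* accumulated CPU time */
};

/* nonzero if @c should be chosen as the victim over @victim */
static int prefer(const struct candidate *c, const struct candidate *victim)
{
	if (c->points != victim->points)
		return c->points > victim->points;
	/* tie: pick the task with the least runtime (the "youngest") */
	return c->runtime_ns < victim->runtime_ns;
}

int main(void)
{
	struct candidate old_daemon = { 1, 900000000000ULL };	/* long-lived */
	struct candidate young_fork = { 1, 2000000ULL };	/* just forked */

	/* equal scores, so the younger task is preferred: prints 1 */
	printf("%d\n", prefer(&young_fork, &old_daemon));
	return 0;
}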

> 2) The current logic uses an 'unsigned int' for the internal score
>    calculation (more precisely, it only uses the values 0-1000). This very
>    low-precision calculation produces many identical oom scores and can
>    lead to an inappropriate process being killed.

The range of 0-1000 allows us to differentiate tasks from one another down 
to 0.1% of system RAM when making oom kill decisions.  If we really want to 
increase this granularity, we could increase the range to 10000 and then 
multiply oom_score_adj values by 10.
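
Sketched out, that finer-grained variant could look like this (illustrative
only, not a patch; the OOM_SCORE_ADJ_MIN "never kill" special case is
omitted):

#include <stdio.h>

#define OOM_SCORE_MAX_FINE 10000	/* one point per 0.01% of RAM */

static unsigned int badness_fine(unsigned long long rss_bytes,
				 unsigned long long total_bytes,
				 int oom_score_adj)	/* still -1000..1000 */
{
	long long points = (long long)(rss_bytes * OOM_SCORE_MAX_FINE /
				       total_bytes);

	points += (long long)oom_score_adj * 10;	/* rescale the adj */
	if (points < 1)
		points = 1;
	if (points > OOM_SCORE_MAX_FINE)
		points = OOM_SCORE_MAX_FINE;
	return (unsigned int)points;
}

int main(void)
{
	unsigned long long total = 16ULL << 30;	/* 16GB machine */

	/* 336MB and 328MB both score 20 on the 0-1000 scale; here: 205, 200 */
	printf("%u\n", badness_fine(336ULL << 20, total, 0));
	printf("%u\n", badness_fine(328ULL << 20, total, 0));
	return 0;
}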

> 3) The current logic gives root processes a bonus of 3% of system RAM.
>    That is obviously too big if you have plenty of memory. Now your
>    fork-bomb processes each have a 500MB OOM-immunity bonus, so your
>    fork-bomb will never be killed.
> 

I agree that a constant proportion for root processes is probably not 
ideal, especially in situations where there are many small threads that 
only use about 1% of system RAM, such as in CAI's case.  I don't agree 
that we need to guard against forkbombs created by root, however.  The 
worst-case scenario is that the continuous killing of non-root threads 
will allow the admin to fix his or her error.
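
For scale, the quoted 500MB figure generalizes: a flat 3%-of-total-RAM
bonus grows with the machine.  A quick userspace back-of-the-envelope
(arithmetic only, not kernel code):

#include <stdio.h>

int main(void)
{
	unsigned long long ram_gb[] = { 1, 16, 256 };

	for (int i = 0; i < 3; i++) {
		unsigned long long ram = ram_gb[i] << 30;

		/* flat 3% of total RAM, reported in MB; 16GB -> ~491MB,
		 * which is the ~500MB figure quoted above */
		printf("%3lluGB RAM -> root bonus ~%lluMB\n",
		       ram_gb[i], ram * 3 / 100 >> 20);
	}
	return 0;
}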
