Message-ID: <alpine.DEB.2.00.1105231522410.17840@chino.kir.corp.google.com>
Date: Mon, 23 May 2011 15:28:29 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, caiqian@...hat.com, hughd@...gle.com,
kamezawa.hiroyu@...fujitsu.com, minchan.kim@...il.com,
oleg@...hat.com
Subject: Re: [PATCH 3/5] oom: oom-killer don't use proportion of system-ram
internally

On Fri, 20 May 2011, KOSAKI Motohiro wrote:
> CAI Qian reported his kernel did hang-up if he ran fork intensive
> workload and then invoke oom-killer.
>
> The problem is, current oom calculation uses 0-1000 normalized value
> (The unit is a permillage of system-ram). Its low precision make
> a lot of same oom score. IOW, in his case, all processes have smaller
> oom score than 1 and internal calculation round it to 1.
>
> Thus oom-killer kill ineligible process. This regression is caused by
> commit a63d83f427 (oom: badness heuristic rewrite).
>
> The solution is, the internal calculation just use number of pages
> instead of permillage of system-ram. And convert it to permillage
> value at displaying time.
>
> This patch doesn't change any ABI (included /proc/<pid>/oom_score_adj)
> even though current logic has a lot of my dislike thing.
>
Same response as when you initially proposed this patch:
http://marc.info/?l=linux-kernel&m=130507086613317 -- you never replied to
that.

The changelog doesn't accurately represent CAI Qian's problem; the issue
is that root processes are given too large a bonus in comparison to
other threads that are using at most 1.9% of available memory. That can
be fixed, as I suggested, by giving a 1% bonus per 10% of memory used so
that the process would have to be using 10% before it even receives a bonus.
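
[Editorial illustration, not part of the original message.] The
proportional bonus described above can be sketched roughly as follows.
This is only a minimal sketch on the 0-1000 scale (10 points per 1% of
memory); the function names are hypothetical, and the flat 30-point
bonus used for comparison is an assumption made for illustration, not
something stated in this thread.

```c
/* On the 0-1000 scale, 10 points correspond to 1% of available memory. */
#define POINTS_PER_PERCENT 10

/* Assumed flat bonus for root tasks (3%), used here only for comparison. */
static int flat_root_bonus(int points)
{
	(void)points;			/* the flat bonus ignores memory usage */
	return 3 * POINTS_PER_PERCENT;
}

/*
 * Hypothetical proportional bonus: 1% off for every full 10% of memory
 * the task uses. A task below 10% (100 points) gets no bonus at all.
 */
static int proportional_root_bonus(int points)
{
	int tens_of_percent = points / (10 * POINTS_PER_PERCENT);

	return tens_of_percent * POINTS_PER_PERCENT;
}

/* Subtract the bonus; the floor of 1 is an assumption for this sketch. */
static int score_after_bonus(int points, int bonus)
{
	int adjusted = points - bonus;

	return adjusted > 0 ? adjusted : 1;
}
```

Under these assumptions, a root task using 1.9% of memory (19 points)
drops to the floor with the flat bonus but keeps its full 19 points with
the proportional one, so it stays comparable to non-root tasks of
similar size.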

I already suggested an alternative patch to CAI Qian to greatly increase
the granularity of the oom score from a range of 0-1000 to 0-10000 to
differentiate between tasks within 0.01% of available memory (16MB on CAI
Qian's 16GB system). I'll propose this officially in a separate email.
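
[Editorial illustration, not part of the original message.] The effect
of a finer score range can be sketched with plain integer arithmetic.
This is a simplified model, not the kernel's badness calculation: the
function name is hypothetical, the score is just the task's share of
total pages multiplied by the scale, and a result of 0 is raised to 1 as
the quoted changelog describes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: normalize a task's page count to a score in the
 * range 1..scale. Integer division truncates, so small tasks collapse
 * onto the same value when the scale is coarse.
 */
static unsigned int scaled_score(uint64_t task_pages, uint64_t total_pages,
				 unsigned int scale)
{
	uint64_t points = task_pages * scale / total_pages;

	return points ? (unsigned int)points : 1;
}
```

For example, with 4194304 total pages (16GB of 4KB pages, an assumed
page size), tasks of 512 pages and 2560 pages both score 1 on the 0-1000
scale, while on the 0-10000 scale they score 1 and 6 and can be told
apart.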

This patch also includes undocumented changes such as changing the bonus
given to root processes.