Message-ID: <alpine.DEB.2.00.1011150215460.2986@chino.kir.corp.google.com>
Date: Mon, 15 Nov 2010 02:34:43 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Ying Han <yinghan@...gle.com>, Bodo Eggert <7eggert@....de>,
Mandeep Singh Baines <msb@...gle.com>,
"Figo.zhang" <figo1802@...il.com>
Subject: Re: [PATCH] Revert oom rewrite series
On Mon, 15 Nov 2010, KOSAKI Motohiro wrote:
> Of course, I denied. He seems to think number of email is meaningful than
> how talk about. but it's incorrect and makes no sense. Why not? Also, He
> have to talk about logically. "Hey, I think it's not bug" makes no sense.
> Such claim don't solve anything. userland is still unhappy. Why not?
> I want to quickly action.
>
If there are pending complaints or bugs that I haven't addressed, please
bring them to my attention. To date, I know of no issues that have been
raised that I have not addressed; you're always free to disagree with my
position, but in the end you may find that, when the kernel moves in a
different direction, you should begin to accept it.
> That said, If anyone want to change userland ABI, Be carefully. They have
> to investigate userland usecase carefully and avoid to break them carefully
> again. If someone think "hey, It's no big matter. userland rewritten can solve
> an issue", I strongly disagree. they don't understand why all of userland
> applications rewritten is harmful.
>
You may remember that the initial version of my rewrite replaced oom_adj
entirely with the new oom_score_adj semantics. Others suggested that it
be separated into a new tunable and the old tunable deprecated for a
lengthy period of time. I accepted that criticism and understood the
drawbacks of replacing the tunable immediately and followed those
suggestions. I disagree with you that the deprecation of oom_adj for a
period of two years is as dramatic as you imply and I disagree that users
are experiencing problems with the linear scale that it now operates on
versus the old exponential scale.
> 1) About two month ago, Dave hansen observed strange OOM issue because he
> has a big machine and ALL process are not so big. thus, eventually all
> process got oom-score=0 and oom-killer didn't work.
>
> https://kerneltrap.org/mailarchive/linux-driver-devel/2010/9/9/6886383
>
> DavidR changed oom-score to +1 in such situation.
>
> http://kerneltrap.org/mailarchive/linux-kernel/2010/9/9/4617455
>
> But it is completely bogus. If all process have score=1, oom-killer fall
> back to purely random killer. I expected and explained his patch has
> its problem at half years ago. but he didn't fix yet.
>
The resolution with which the oom killer considers memory is at 0.1% of
system RAM at its highest (smaller when you have a memory controller,
cpuset, or mempolicy constrained oom). It considers a task within 0.1% of
memory of another task to have equal "badness" to kill; we don't break
ties within that resolution -- it all depends on which one shows up in
the tasklist first. If you disagree with that resolution, which I support
as being high enough, then you may certainly propose a patch to make it
even finer at 0.01%, 0.001%, etc. It would only change oom_badness() to
range between [0,10000], [0,100000], etc.
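
To illustrate the resolution argument above, here is a minimal Python sketch.
It is not the kernel's oom_badness(); the function name badness(), the scale
parameter, and the machine size are assumptions made for illustration, and the
real heuristic applies further adjustments that are left out here. It only
shows how an integer score in [0,1000] makes two tasks within 0.1% of memory
of each other tie, and how a range of [0,10000] would separate them.

```python
def badness(task_pages: int, total_pages: int, scale: int = 1000) -> int:
    """Hypothetical stand-in for the proportional part of the heuristic:
    the task's share of memory as an integer in [0, scale]."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # Integer division is what limits the resolution to 1/scale of memory.
    return min(task_pages, total_pages) * scale // total_pages


TOTAL = 4_000_000  # assumed amount of memory considered, in pages

a = badness(400_000, TOTAL)  # task using 10.00% of memory
b = badness(402_000, TOTAL)  # task using 10.05% of memory
print(a, b)  # 100 100: a tie, so tasklist order decides

# A finer range of [0,10000] tells the same two tasks apart.
print(badness(400_000, TOTAL, scale=10_000),
      badness(402_000, TOTAL, scale=10_000))  # 1000 1005

# A task below 0.1% of memory rounds down to 0 at the default scale.
print(badness(3_000, TOTAL))  # 0
```

The last line is the situation described in the quoted complaint: on a large
machine where every task is small, all of the scores round down to the same
value.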
> 2) Also half years ago, I did explained oom_adj is used from multiple
> applications. And we can't break them. But DavidR didn't fix.
>
And we didn't. oom_adj is still there and maps linearly to oom_score_adj;
you just can't show a single application where that mapping breaks because
it was based on an actual calculation.
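
As a rough illustration of a linear oom_adj to oom_score_adj mapping, here is
a short Python sketch. The constants (oom_adj spanning -17 to +15,
oom_score_adj spanning -1000 to +1000) and the special case for the maximum
value are assumptions that are not stated in this message, and the function
name is hypothetical; check them against the kernel source before relying on
them.

```python
# Assumed limits; none of these values appear in this message.
OOM_DISABLE = -17         # oom_adj value that exempts a task from the oom killer
OOM_ADJUST_MAX = 15       # largest oom_adj value
OOM_SCORE_ADJ_MAX = 1000  # largest oom_score_adj value


def oom_adj_to_score_adj(oom_adj: int) -> int:
    """Hypothetical helper: scale an oom_adj value linearly onto the
    oom_score_adj range, truncating toward zero as C integer division does."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj out of range")
    if oom_adj == OOM_ADJUST_MAX:
        # Assumed special case so that the maximum stays attainable.
        return OOM_SCORE_ADJ_MAX
    magnitude = abs(oom_adj) * OOM_SCORE_ADJ_MAX // abs(OOM_DISABLE)
    return magnitude if oom_adj >= 0 else -magnitude


for adj in (-17, -8, 0, 8, 15):
    print(adj, oom_adj_to_score_adj(adj))
# -17 -1000
# -8 -470
# 0 0
# 8 470
# 15 1000
```

Under these assumptions the mapping is monotonic, so any relative ordering an
application set up through oom_adj carries over to oom_score_adj.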
If you would like to cite these "multiple" applications that need to be
converted to use oom_score_adj (I know of udev), please let me know and
if they're open-source applications then I will commit to submitting
patches for them myself. I believe the two year window is sufficient for
everyone else, though.
> 3) Also about four month ago, I and kamezawa-san pointed out his patch
> don't work on memcg. It also haven't been fixed.
>
I don't know what you're referring to here, sorry.
> In the other hand, You can't explain what worth OOM-rewritten patch has.
> Because there is nothing. It is only "powerful"(TM) for Google. but
> instead It has zero worth for every other people. Here is just technical
> issue. Bah.
>
Please see my reply to Figo.zhang where I enumerate the four reasons why
the new userspace tunable is more powerful than oom_adj.
At this point, I can only speculate that your distaste for the new oom
killer is one of disposition; it seems like every time you reply to an
email (or, more regularly, just repost your revert) you come into it
with the attitude that my response cannot possibly be correct and that the
way you see things is exactly as they should be. If you were to consider
other people's opinions, however, you may find some common ground that can
be met. I certainly did that when I introduced oom_score_adj instead of
replacing oom_adj immediately. I also did it when I removed the forkbomb
detector from the rewrite. I also did it when considering swap in the
heuristic when it initially was only rss. Andrew is in the position where
he has to make a judgment call on what should be included and what
shouldn't and it should be pretty darn clear after you post your revert
the first time, then the second time, then the third time, then the fourth
time, and now the fifth time.