Message-Id: <20091028135519.805c4789.kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 28 Oct 2009 13:55:19 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: David Rientjes <rientjes@...gle.com>
Cc: vedran.furac@...il.com, Hugh Dickins <hugh.dickins@...cali.co.uk>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
minchan.kim@...il.com, Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: Memory overcommit
On Tue, 27 Oct 2009 21:08:56 -0700 (PDT)
David Rientjes <rientjes@...gle.com> wrote:
> On Wed, 28 Oct 2009, Vedran Furac wrote:
>
> > > This is wrong; it doesn't "emulate oom" since oom_kill_process() always
> > > kills a child of the selected process instead if they do not share the
> > > same memory. The chosen task in that case is untouched.
> >
> > OK, I stand corrected then. Thanks! But, while testing this I lost X
> > once again and "test" survived for some time (check the timestamps):
> >
> > http://pastebin.com/d5c9d026e
> >
> > - It started by killing gkrellm(!!!)
> > - Then I lost X (kdeinit4 I guess)
> > - Then 103 seconds after the killing started, it killed "test" - the
> > real culprit.
> >
> > I mean... how?!
> >
>
> Here are the five oom kills that occurred in your log, and notice that the
> first four times it kills a child and not the actual task as I explained:
>
> [97137.724971] Out of memory: kill process 21485 (VBoxSVC) score 1564940 or a child
> [97137.725017] Killed process 21503 (VirtualBox)
> [97137.864622] Out of memory: kill process 11141 (kdeinit4) score 1196178 or a child
> [97137.864656] Killed process 11142 (klauncher)
> [97137.888146] Out of memory: kill process 11141 (kdeinit4) score 1184308 or a child
> [97137.888180] Killed process 11151 (ksmserver)
> [97137.972875] Out of memory: kill process 11141 (kdeinit4) score 1146255 or a child
> [97137.972888] Killed process 11224 (audacious2)
>
> Those are practically happening simultaneously with very little memory
> being available between each oom kill. Only later is "test" killed:
>
> [97240.203228] Out of memory: kill process 5005 (test) score 256912 or a child
> [97240.206832] Killed process 5005 (test)
>
> Notice how the badness score is less than 1/4th of the others. So while
> you may find it to be hogging a lot of memory, there were others that
> consumed much more.
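(For reference, the child-first behaviour visible in the quoted log comes from
oom_kill_process(): it walks the selected victim's children and kills one that
does not share the victim's mm before falling back to the victim itself. A
trimmed sketch from memory -- not the exact code in Vedran's tree, and
oom_kill_task() here just stands for the helper that actually delivers the kill:)
==
/* sketch only; locking, error handling and the printk are omitted */
static int oom_kill_process_sketch(struct task_struct *p)
{
	struct task_struct *c;

	/* try to sacrifice a child that has its own mm first,
	 * so the selected parent (kdeinit4, VBoxSVC, ...) survives */
	list_for_each_entry(c, &p->children, sibling) {
		if (c->mm == p->mm)
			continue;	/* shares memory with the parent */
		if (!oom_kill_task(c))
			return 0;	/* a child was killed instead */
	}

	/* no suitable child: kill the selected task itself */
	return oom_kill_task(p);
}
==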
This is not related to the child-parent problem, though.
Look at these numbers a bit more closely:
==
[97137.709272] Active_anon:671487 active_file:82 inactive_anon:132316
[97137.709273] inactive_file:82 unevictable:50 dirty:0 writeback:0 unstable:0
[97137.709273] free:6122 slab:17179 mapped:30661 pagetables:8052 bounce:0
==
active_file + inactive_file is very low; almost all pages are anonymous.
But "mapped" (NR_FILE_MAPPED) is a little high. This implies the remaining file
cache is mapped by many processes, OR some megabytes of shmem are in use.
The number of pagetable pages is 8052, which means
8052 x (4096/8 PTEs) x 4k = ~16GB of mapped (virtual) area.
Total available memory looks to be roughly anon + file + unevictable + free + slab + pagetables:
671487 + 82 + 132316 + 82 + 50 + 6122 + 17179 + 8052 = 835370 pages x 4k ≈ 3.2GB?
(this system is swapless)
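Spelling the two calculations out, assuming x86-64 (4k pages, 8-byte PTEs),
something like:
==
#include <stdio.h>

int main(void)
{
	unsigned long pgtable_pages = 8052;		/* "pagetables:" above */
	unsigned long ptes_per_page = 4096 / 8;		/* 512 PTEs per table page */
	unsigned long lru_free_etc  = 671487 + 82 + 132316 + 82 + 50
				      + 6122 + 17179 + 8052;

	printf("address space covered by pagetables: ~%llu MB\n",
	       pgtable_pages * ptes_per_page * 4096ULL >> 20);	/* ~16104 MB */
	printf("anon+file+unevictable+free+slab+pgtables: ~%llu MB\n",
	       lru_free_etc * 4096ULL >> 20);			/* ~3263 MB */
	return 0;
}
==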
Then, considering the pmap output KOSAKI posted,
I guess the killed processes had a big total_vm but not much real RSS,
so killing them didn't help the OOM situation much.
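(The selection score in this era is based mostly on total_vm, not RSS; roughly,
from memory, badness() does something like the sketch below, so a task that has
mapped a lot but touched little scores high even though killing it frees little
real memory.)
==
/* rough sketch of badness(); the real function also folds in CPU time,
 * run time, nice level, capabilities, oom_adj and cpuset/node affinity */
static unsigned long badness_sketch(struct task_struct *p)
{
	struct task_struct *child;
	unsigned long points = p->mm->total_vm;	/* virtual size, not RSS */

	/* children with their own mm add half of their total_vm,
	 * which keeps pushing big parents like kdeinit4 to the top */
	list_for_each_entry(child, &p->children, sibling)
		if (child->mm && child->mm != p->mm)
			points += child->mm->total_vm / 2 + 1;

	return points;
}
==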
Thanks,
-Kame