Message-Id: <20090819195242.4454a35f.minchan.kim@barrios-desktop>
Date: Wed, 19 Aug 2009 19:52:42 +0900
From: Minchan Kim <minchan.kim@...il.com>
To: Mel Gorman <mel@....ul.ie>
Cc: Minchan Kim <minchan.kim@...il.com>,
????????? <chungki.woo@...il.com>, ngupta@...are.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
fengguang.wu@...el.com, riel@...hat.com, akpm@...ux-foundation.org,
kosaki.motohiro@...fujitsu.com
Subject: Re: abnormal OOM killer message
Thanks for the good comments, Mel.
On Wed, 19 Aug 2009 11:36:11 +0100
Mel Gorman <mel@....ul.ie> wrote:
> On Wed, Aug 19, 2009 at 03:49:58PM +0900, Minchan Kim wrote:
> > On Wed, 19 Aug 2009 15:24:54 +0900
> > ????????? <chungki.woo@...il.com> wrote:
> >
> > > Thank you very much for the replies.
> > >
> > > But I don't think this is related to the stale data problem in compcache.
> > > My question was why the last-chance memory allocation failed.
> > > When the OOM killer ran, the memory state did not warrant running
> > > the OOM killer.
> > > In particular, there were many free order-0 pages, and the allocation order was zero.
> > > I think that last allocation should have succeeded.
> > > That's my worry.
> >
> > Yes. I agree with you.
> > Mel, could you comment on this situation?
> > Is it possible for an order-0 allocation to fail
> > even when there are many pages in the buddy allocator?
> >
>
> Not ordinarily. If it happens, I tend to suspect that the free list data
> is corrupted and would put a check in __rmqueue() that looked like
>
> BUG_ON(list_empty(&area->free_list) && area->nr_free);
If memory is corrupted, the two conditions may not both hold at the same time.
It would be better to OR the conditions:

BUG_ON(list_empty(&area->free_list) || area->nr_free);
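To compare the two forms of the check, here is a minimal userspace sketch. It is not kernel code: struct free_area_model and the *_fires helpers are hypothetical stand-ins, and BUG_ON is modelled with assert(). In this model the ANDed form fires only when the list is empty while nr_free is nonzero, whereas the ORed form also fires for a healthy area that simply has free pages, so a mismatch test between the list and the counter may be closer to what is wanted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel macro: abort when cond is true. */
#define BUG_ON(cond) assert(!(cond))

/* Hypothetical, simplified model of one buddy free area (an assumption,
 * not the kernel's struct free_area). */
struct free_area_model {
    size_t list_len;        /* entries actually linked on the free list */
    unsigned long nr_free;  /* counter the allocator maintains */
};

static bool model_list_empty(const struct free_area_model *a)
{
    return a->list_len == 0;
}

/* ANDed form: the list is empty although the counter claims free pages. */
static bool and_check_fires(const struct free_area_model *a)
{
    return model_list_empty(a) && a->nr_free;
}

/* ORed form: fires when the list is empty OR the counter is nonzero. */
static bool or_check_fires(const struct free_area_model *a)
{
    return model_list_empty(a) || a->nr_free;
}

/* Mismatch form: fires whenever the list and the counter disagree. */
static bool mismatch_check_fires(const struct free_area_model *a)
{
    return model_list_empty(a) != (a->nr_free == 0);
}
```

With a healthy area such as { .list_len = 3, .nr_free = 3 }, and_check_fires() and mismatch_check_fires() stay quiet while or_check_fires() triggers. With { .list_len = 2, .nr_free = 0 }, only the mismatch form triggers.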
> The second question is, why are we in direct reclaim this far above the
> watermark? It should only be kswapd that is doing any reclaim at that
> point. That makes me wonder again are the free lists corrupted.
It does make sense!
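For reference, the order-0 case of the watermark test can be pictured roughly as below. This is only a simplified userspace sketch: struct zone_wmark_model and order0_watermark_ok() are hypothetical names, and the real check also handles higher orders and allocation flags, which are left out here.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the per-zone numbers that matter for
 * an order-0 allocation checked against the high watermark. */
struct zone_wmark_model {
    long free_pages;      /* free pages currently in the zone */
    long wmark_high;      /* high watermark, in pages */
    long lowmem_reserve;  /* pages held back for lower zones */
};

/* An order-0 request passes only if the free pages are strictly above
 * the watermark plus the reserve. */
static bool order0_watermark_ok(const struct zone_wmark_model *z)
{
    return z->free_pages > z->wmark_high + z->lowmem_reserve;
}
```

If the reported free pages are well above the high watermark and this kind of test still fails, the numbers being reported and the state of the free lists probably disagree.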
> The other possibility is that the zonelist used for allocation in the
> troubled path contains no populated zones. I would put a BUG_ON check in
> get_page_from_freelist() to check if the first zone in the zonelist has no
> pages. If that bug triggers, it might explain why OOMs are triggering for
> no good reason.
Yes. Chungki, could you put both BUG_ON checks in their respective functions
and try to reproduce the problem?
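The zonelist check Mel describes could look something like the following. This is a userspace sketch only: struct zone_model, struct zonelist_model and both helpers are hypothetical stand-ins, not the kernel's own types, and the in-kernel version would be a BUG_ON on the equivalent condition inside get_page_from_freelist().

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ZONES 4

/* Hypothetical, simplified stand-ins for a zone and a zonelist. */
struct zone_model {
    const char *name;
    unsigned long present_pages;  /* pages physically present in the zone */
};

struct zonelist_model {
    /* NULL-terminated, most preferred zone first */
    const struct zone_model *zones[MODEL_MAX_ZONES];
};

/* True if the first zone in the zonelist is missing or has no pages. */
static bool first_zone_unpopulated(const struct zonelist_model *zl)
{
    const struct zone_model *first = zl->zones[0];

    return first == NULL || first->present_pages == 0;
}

/* True if no zone anywhere in the zonelist has any pages. */
static bool zonelist_has_no_pages(const struct zonelist_model *zl)
{
    for (size_t i = 0; i < MODEL_MAX_ZONES && zl->zones[i] != NULL; i++) {
        if (zl->zones[i]->present_pages > 0)
            return false;
    }
    return true;
}
```

The second helper is stricter than what Mel suggested. It is included because a zonelist whose first zone is empty but whose later zones are populated may be a less alarming case than one with no populated zones at all.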
> I consider both of those possibilities abnormal though.
>
> > >
> > > -----------------------------------------------------------------------------------------------------------------------------------------------
> > > page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,    <== this is last chance
> > >                 zonelist, ALLOC_WMARK_HIGH|ALLOC_CPUSET);         <== uses ALLOC_WMARK_HIGH
> > > if (page)
> > >         goto got_pg;
> > >
> > > out_of_memory(zonelist, gfp_mask, order);
> > > goto restart;
> > > -----------------------------------------------------------------------------------------------------------------------------------------------
> > >
> > > > Let me ask a question.
> > > > The system now reports 79M of total swap.
> > > > That is bigger than the system memory size.
> > > > Is that possible with compcache?
> > > > Can we trust that number?
> > >
> > > Yes, it's possible. 79 MB is the amount of data that can be swapped.
> > > It is not the compressed data size, just the original data size.
> >
> > You mean that 79M worth of your pages are swapped out into compcache's
> > reserved memory?
> >
>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
--
Kind regards,
Minchan Kim