Message-ID: <BANLkTikZeDiNwh+hihEMWwGyh6+ZVMA=_A@mail.gmail.com>
Date: Wed, 18 May 2011 22:41:01 -0400
From: Andrew Lutomirski <luto@....edu>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: Minchan Kim <minchan.kim@...il.com>,
Wu Fengguang <fengguang.wu@...el.com>,
Andi Kleen <andi@...stfloor.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Mel Gorman <mgorman@...e.de>,
Johannes Weiner <hannes@...xchg.org>,
Rik van Riel <riel@...hat.com>
Subject: Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)
On Wed, May 18, 2011 at 10:30 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@...fujitsu.com> wrote:
> On Wed, 18 May 2011 22:15:53 -0400
> Andrew Lutomirski <luto@....edu> wrote:
>
>> On Wed, May 18, 2011 at 1:17 AM, Minchan Kim <minchan.kim@...il.com> wrote:
>> > On Wed, May 18, 2011 at 4:22 AM, Andrew Lutomirski <luto@....edu> wrote:
>
>> > Andrew, Could you test this patch with !pgdat_balanced patch?
>> > I think we shouldn't see OOM message if we have lots of free swap space.
>> >
>> > == CUT_HERE ==
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index f73b865..cc23f04 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -1341,10 +1341,6 @@ static inline bool
>> > should_reclaim_stall(unsigned long nr_taken,
>> > if (current_is_kswapd())
>> > return false;
>> >
>> > - /* Only stall on lumpy reclaim */
>> > - if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
>> > - return false;
>> > -
>> > /* If we have relaimed everything on the isolated list, no stall */
>> > if (nr_freed == nr_taken)
>> > return false;
>> >
>> >
>> >
>> > Then, if you don't see any unnecessary OOM but still see the hangup,
>> > could you apply this patch based on previous?
>>
>> With this patch, I started GNOME and Firefox, turned on swap, and ran
>> test_mempressure.sh 1500 1400 1. Instant panic (or OOPS and hang or
>> something -- didn't get the top part). Picture attached -- it looks
>> like memcg might be involved. I'm running F15, so it might even be
>> doing something.
>>
>
> Hmm, what kernel version do you use ?
> I think memcg is not guilty because RIP is shrink_page_list().
> But ok, I'll dig this. Could you give us your .config ?
Attached.
The faulting address in shrink_page_list is a ud2 instruction, which
(I think) comes from VM_BUG_ON(PageActive(page)). The sequence is:
0xffffffff810d24cc <+202>: callq 0xffffffff810cf930 <test_and_set_bit>
0xffffffff810d24d1 <+207>: test %eax,%eax
0xffffffff810d24d3 <+209>: jne 0xffffffff810d2aa5 <shrink_page_list+1699>
0xffffffff810d24d9 <+215>: mov -0x28(%rbx),%rax
0xffffffff810d24dd <+219>: test $0x40,%al
0xffffffff810d24df <+221>: je 0xffffffff810d24e3 <shrink_page_list+225>
0xffffffff810d24e1 <+223>: ud2
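As a rough cross-check of that guess, here is a small userspace sketch (not kernel code) that maps the mask in "test $0x40,%al" back to a page flag. The enum ordering below is an assumption copied from memory of include/linux/page-flags.h in kernels of this era and should be checked against the tree actually being debugged; the names sketch_pageflags, mask_to_bit and would_hit_ud2 are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/*
 * ASSUMPTION: ordering of the low page flags, as I recall it from
 * include/linux/page-flags.h around 2.6.39. Verify against your tree.
 */
enum sketch_pageflags {
	SK_PG_locked,		/* bit 0 */
	SK_PG_error,		/* bit 1 */
	SK_PG_referenced,	/* bit 2 */
	SK_PG_uptodate,		/* bit 3 */
	SK_PG_dirty,		/* bit 4 */
	SK_PG_lru,		/* bit 5 */
	SK_PG_active,		/* bit 6 */
};

/* Return the bit index of a single-bit mask, or -1 if the mask
 * does not have exactly one bit set. */
static int mask_to_bit(unsigned long mask)
{
	int bit = 0;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;
	while ((mask & 1UL) == 0) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/* Mirrors "test $0x40,%al; je <skip>; ud2": returns true when the
 * branch is not taken, i.e. when execution would reach the ud2. */
static bool would_hit_ud2(unsigned long page_flags)
{
	return (page_flags & (1UL << SK_PG_active)) != 0;
}
```

Under that assumed ordering, mask_to_bit(0x40) gives 6, which matches SK_PG_active, so the ud2 is reached exactly when the active bit is set in the loaded flags word. That is consistent with VM_BUG_ON(PageActive(page)), but it does not prove which source line the check came from.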
--Andy
Download attachment ".config" of type "application/octet-stream" (88497 bytes)