Message-ID: <50DD9EA7.6050309@iskon.hr>
Date: Fri, 28 Dec 2012 14:29:11 +0100
From: Zlatko Calusic <zlatko.calusic@...on.hr>
To: Minchan Kim <minchan@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>, linux-mm <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Zhouping Liu <zliu@...hat.com>,
Sedat Dilek <sedat.dilek@...il.com>
Subject: Re: [PATCH] mm: fix null pointer dereference in wait_iff_congested()
On 28.12.2012 03:49, Minchan Kim wrote:
> Hello Zlatko,
>
> On Fri, Dec 28, 2012 at 03:16:38AM +0100, Zlatko Calusic wrote:
>> From: Zlatko Calusic <zlatko.calusic@...on.hr>
>>
>> The unintended consequence of commit 4ae0a48b is that
>> wait_iff_congested() can now be called with NULL struct zone*
>> producing kernel oops like this:
>
> For good description, it would be better to write simple pseudo code
> flow to show how NULL-zone pass into wait_iff_congested because
> kswapd code flow is too complex.
>
> As I see the code, we have following line above wait_iff_congested.
>
> if (!unbalanced_zone || blah blah)
> break;
>
> How can NULL unbalanced_zone reach wait_iff_congested?
>
Hello Minchan, and thanks for the comment.

That line was there before commit 4ae0a48b went in, and you're right,
it's what was protecting wait_iff_congested() from being called with
a NULL zone*. But then all that logic got collapsed into a simple
pgdat_balanced() call, and that's where I introduced the bug: I lost the
protection.

What I _think_ is happening (pseudo code following...) is that after
scanning the zones in the dma->highmem direction and concluding that all
zones are balanced (unbalanced_zone remains NULL!),
wake_up(&pgdat->pfmemalloc_wait) wakes up a lot of memory-hungry
processes (especially likely in aggressive tests/benchmarks) that
immediately drain and unbalance one or more zones. The pgdat_balanced()
call that immediately follows then returns false, but we still have
unbalanced_zone = NULL, remember? Oops...
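
To make that sequence easier to follow, here is a rough user-space model
of it in C. It is only a sketch of the suspected flow, not the kernel
code: struct zone, struct pgdat, scan_zones(), wake_and_drain(),
balance_pass() and the outcome enum are all hypothetical stand-ins, and
pgdat_balanced() is simplified to "every zone is at or above its high
watermark".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel definitions. */
struct zone {
    long free_pages;
    long high_wmark;
};

#define MAX_ZONES 3

struct pgdat {
    struct zone zones[MAX_ZONES];
    int nr_zones;
};

enum outcome {
    NODE_BALANCED,     /* nothing left to do */
    WAIT_ON_ZONE,      /* wait_iff_congested() would get a valid zone */
    WAIT_ON_NULL_ZONE  /* wait_iff_congested(NULL, ...): the oops */
};

static bool zone_balanced(const struct zone *z)
{
    return z->free_pages >= z->high_wmark;
}

/* Simplified assumption: the node is balanced only if every zone is. */
static bool pgdat_balanced(const struct pgdat *p)
{
    for (int i = 0; i < p->nr_zones; i++)
        if (!zone_balanced(&p->zones[i]))
            return false;
    return true;
}

/* Zone scan: remember the last unbalanced zone seen, or NULL if none. */
static struct zone *scan_zones(struct pgdat *p)
{
    struct zone *unbalanced_zone = NULL;

    for (int i = 0; i < p->nr_zones; i++)
        if (!zone_balanced(&p->zones[i]))
            unbalanced_zone = &p->zones[i];
    return unbalanced_zone;
}

/*
 * Models wake_up(&pgdat->pfmemalloc_wait) followed by the woken
 * processes immediately allocating everything in the first zone.
 */
static void wake_and_drain(struct pgdat *p)
{
    assert(p->nr_zones > 0);
    p->zones[0].free_pages = 0;
}

/* One pass of the suspected flow, without any NULL protection. */
static enum outcome balance_pass(struct pgdat *p, bool waiters_drain)
{
    struct zone *unbalanced_zone = scan_zones(p); /* NULL if all balanced */

    if (waiters_drain)
        wake_and_drain(p);  /* zones change AFTER the scan */

    if (pgdat_balanced(p))
        return NODE_BALANCED;

    /* Node is unbalanced now, but the pointer dates from the scan. */
    return unbalanced_zone ? WAIT_ON_ZONE : WAIT_ON_NULL_ZONE;
}
```

Starting from a node where every zone is above its watermark,
balance_pass(&node, true) ends in WAIT_ON_NULL_ZONE, which is the case
the old "!unbalanced_zone" check used to filter out.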

But all of that is speculation that I can't prove atm. Of course, if
anybody thinks that's a credible explanation, I could add it to the commit
message, or even as a code comment, but I didn't want to be overly
imaginative. The fix itself is simple and real.

Regards,
--
Zlatko