Message-ID: <4A5E9E4E.5000308@redhat.com>
Date: Wed, 15 Jul 2009 23:28:14 -0400
From: Rik van Riel <riel@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
Wu Fengguang <fengguang.wu@...el.com>
Subject: Re: [PATCH -mm] throttle direct reclaim when too many pages are isolated already
Andrew Morton wrote:
> On Wed, 15 Jul 2009 23:10:43 -0400 Rik van Riel <riel@...hat.com> wrote:
>
>> Andrew Morton wrote:
>>> On Wed, 15 Jul 2009 22:38:53 -0400 Rik van Riel <riel@...hat.com> wrote:
>>>
>>>> When way too many processes go into direct reclaim, it is possible
>>>> for all of the pages to be taken off the LRU. One result of this
>>>> is that the next process in the page reclaim code thinks there are
>>>> no reclaimable pages left and triggers an out of memory kill.
>>>>
>>>> One solution to this problem is to never let so many processes into
>>>> the page reclaim path that the entire LRU is emptied. Limiting the
>>>> system to only having half of each inactive list isolated for
>>>> reclaim should be safe.
>>>>
>>> Since when? Linux page reclaim has a billion machine years testing and
>>> now stuff like this turns up. Did we break it or is this a
>>> never-before-discovered workload?
>> It's been there for years, in various forms. It hardly ever
>> shows up, but Kosaki's patch series gives us a nice chance to
>> fix it for good.
>
> OK.
>
>>>> @@ -1049,6 +1070,10 @@ static unsigned long shrink_inactive_lis
>>>> struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
>>>> int lumpy_reclaim = 0;
>>>>
>>>> + while (unlikely(too_many_isolated(zone, file))) {
>>>> + schedule_timeout_interruptible(HZ/10);
>>>> + }
>>> This (incorrectly-laid-out) code is a no-op if signal_pending().
>> Good point, I should add some code to break out of page reclaim
>> if a fatal signal is pending,
>
> We can't just return NULL from __alloc_pages(), and if we can't
> get a page from the freelists then we're just going to have to keep
> reclaiming. So I'm not sure how we can do this.
If we are stuck at this point in the page reclaim code,
it is because too many other tasks are reclaiming pages.
That makes it fairly safe to just return SWAP_CLUSTER_MAX
here and hope that __alloc_pages() can get a page.
After all, if __alloc_pages() thinks it made progress,
but still cannot make the allocation, it will call the
pageout code again.
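
A rough userspace sketch of that idea, for illustration only. It is not
the patch itself and does not use kernel APIs. struct zone_model,
model_wait() and throttle_model() are hypothetical names, the
fatal_signal flag stands in for a fatal-signal check on the current
task, and the "isolated > inactive" test is one assumed reading of
"half of each inactive list isolated". SWAP_CLUSTER_MAX is taken to be
32, as in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32UL	/* assumed to match the kernel's value */

/* Hypothetical stand-in for the per-zone state the throttle looks at. */
struct zone_model {
	unsigned long nr_inactive;	/* pages still on the inactive list */
	unsigned long nr_isolated;	/* pages taken off the list for reclaim */
	bool fatal_signal;		/* models a pending fatal signal */
	unsigned long waits;		/* number of simulated sleeps */
};

/* More pages isolated than left on the list: over half are off the LRU. */
static bool too_many_isolated(const struct zone_model *z)
{
	return z->nr_isolated > z->nr_inactive;
}

/*
 * Models one sleep: while this task waits, other reclaimers finish and
 * put up to 8 isolated pages back on the list (arbitrary number).
 */
static void model_wait(struct zone_model *z)
{
	unsigned long back = z->nr_isolated < 8 ? z->nr_isolated : 8;

	z->nr_isolated -= back;
	z->nr_inactive += back;
	z->waits++;
}

/*
 * Returns SWAP_CLUSTER_MAX if the task bails out because it is about
 * to die, so the caller believes progress was made and retries the
 * allocation. Returns 0 once the throttle clears, meaning the caller
 * should go on to reclaim normally (not modelled here).
 */
static unsigned long throttle_model(struct zone_model *z)
{
	assert(z != NULL);

	while (too_many_isolated(z)) {
		if (z->fatal_signal)
			return SWAP_CLUSTER_MAX;
		model_wait(z);
	}
	return 0;
}
```

The signal check sits inside the loop, so a task with a fatal signal
pending leaves immediately instead of spinning through a sleep that
returns at once.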
--
All rights reversed.