[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BANLkTimbu0pDNb1cHGu0B6P-foRHQ2uiWw@mail.gmail.com>
Date: Tue, 24 May 2011 17:57:07 +0900
From: Minchan Kim <minchan.kim@...il.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: abarry@...y.com, akpm@...ux-foundation.org, linux-mm@...ck.org,
mgorman@...e.de, riel@...hat.com, hannes@...xchg.org,
linux-kernel@...r.kernel.org, fengguang.wu@...el.com
Subject: Re: Unending loop in __alloc_pages_slowpath following OOM-kill; rfc: patch.
On Tue, May 24, 2011 at 5:41 PM, KOSAKI Motohiro
<kosaki.motohiro@...fujitsu.com> wrote:
>>> I'm sorry I missed this thread long time.
>>
>> No problem. It would be better than not review.
>
> thx.
>
>
>>> In this case, I think we should call drain_all_pages(). then following
>>> patch is better.
>>
>> Strictly speaking, this problem isn't related to drain_all_pages.
>> This problem caused by lru empty but I admit it could work well if
>> your patch applied.
>> So yours could help, too.
>>
>>> However I also think your patch is valuable. because while the task is
>>> sleeping in wait_iff_congested(), an another task may free some pages.
>>> thus, rebalance path should try to get free pages. iow, you makes sense.
>>
>> Yes.
>> Off-topic.
>> I would like to move cond_resched below get_page_from_freelist in
>> __alloc_pages_direct_reclaim. Otherwise, it is likely we can be stolen
>> pages to other processes.
>> One more benefit is that if it's apparently OOM path(ie,
>> did_some_progress = 0), we can reduce OOM kill latency due to remove
>> unnecessary cond_resched.
>
> I agree. Can you please mind to send a patch?
I had but at that time, Andrew had a concern.
I will resend it when I have a time. Let's discuss, again.
>
>
>>> So, I'd like to propose to merge both your and my patch.
>>
>> Recently, there was discussion on drain_all_pages with Wu.
>> He saw much overhead in 8-core system, AFAIR.
>> I Cced Wu.
>>
>> How about checking per-cpu before calling drain_all_pages() than
>> unconditional calling?
>> if (per_cpu_ptr(zone->pageset, smp_processor_id())
>> drain_all_pages();
>>
>> Of course, It can miss other CPU free pages. But above routine assume
>> local cpu direct reclaim is successful but it failed by per-cpu. So I
>> think it works.
>
> Can you please tell me previous discussion url or mail subject?
> I mean, if it is costly and performance degression risk, we don't have to
> take my idea.
Yes. You could see it by https://lkml.org/lkml/2011/4/30/81.
>
> Thanks.
>
>
>>
>> Thanks for good suggestion and Reviewed-by, KOSAKI.
>
>
>
--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists