linux-kernel - Re: Accounting problem of MIGRATE


Message-ID: <CAHGf_=rJK_RV2UmaFCTjtd6taKVXZCKYz66TwPfSRCqcUo=PqQ@mail.gmail.com>
Date:	Fri, 22 Jun 2012 04:13:47 -0400
From:	KOSAKI Motohiro <kosaki.motohiro@...il.com>
To:	Aaditya Kumar <aaditya.kumar.30@...il.com>
Cc:	Minchan Kim <minchan@...nel.org>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Mel Gorman <mel@....ul.ie>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>, tim.bird@...sony.com,
	frank.rowand@...sony.com, takuzo.ohara@...sony.com,
	kan.iibuchi@...sony.com, aaditya.kumar@...sony.com
Subject: Re: Accounting problem of MIGRATE_ISOLATED freed page

On Fri, Jun 22, 2012 at 3:56 AM, Aaditya Kumar
<aaditya.kumar.30@...il.com> wrote:
> On Fri, Jun 22, 2012 at 12:52 PM, KOSAKI Motohiro
> <kosaki.motohiro@...il.com> wrote:
>>> Let me summarize again.
>>>
>>> The problem:
>>>
>>> When hotplug offlining happens on zone A, pages start to be freed into the buddy allocator as MIGRATE_ISOLATE type.
>>> (MIGRATE_ISOLATE is an ironic type: the pages are apparently in buddy, but we can't allocate them.)
>>> When a memory shortage happens during hotplug offlining, the current task starts to reclaim and wakes up kswapd.
>>> Kswapd checks the watermark, then goes back to sleep BECAUSE the current zone_watermark_ok_safe doesn't take
>>> the MIGRATE_ISOLATE free page count into account. The current task continues in the direct reclaim path without kswapd's help.
>>> The problem is that zone->all_unreclaimable is set only by kswapd, so the current task would loop forever,
>>> as shown below.
>>>
>>> __alloc_pages_slowpath
>>> restart:
>>>        wake_all_kswapd
>>> rebalance:
>>>        __alloc_pages_direct_reclaim
>>>                do_try_to_free_pages
>>>                        if global_reclaim && !all_unreclaimable
>>>                                return 1; /* i.e. reported as did_some_progress */
>>>        skip __alloc_pages_may_oom
>>>        should_alloc_retry
>>>                goto rebalance;
>>>
>>> If we apply KOSAKI's patch[1], which doesn't depend on kswapd to set zone->all_unreclaimable,
>>> we can solve this problem by killing some task. But that still doesn't wake up kswapd.
>>> It could still be a problem if another subsystem needs a GFP_ATOMIC allocation.
>>> So kswapd should consider MIGRATE_ISOLATE when it calculates free pages before going to sleep.
>>
>> I agree. And I believe we should remove the rebalance label, and
>> allocation retries should always wake up kswapd, because
>> wake_all_kswapd is unreliable: it has no guarantee of succeeding in
>> waking up kswapd. So this micro-optimization is NOT an optimization,
>> just a source of trouble. Our memory reclaim logic has a lot of races
>> by design, so no reclaim code should assume that someone else works fine.
>>
>
> I think this is a better approach. Since MIGRATE_ISOLATE is really a
> temporary phenomenon, it makes sense to just retry the allocation.
> One issue with this approach, however, is that it does not exactly work
> for PAGE_ALLOC_COSTLY_ORDER. But given the frequency of such
> allocations, I think it may be an acceptable compromise to handle such
> requests by OOM when many MIGRATE_ISOLATE pages are present.
>
> What do you think?

I think we need both changes.
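
To make the accounting half of this concrete, here is a rough userspace
sketch of the watermark check discussed above. It is not kernel code and
not a proposed patch: struct toy_zone, its fields, and both helper
functions are hypothetical names invented for illustration, and the real
zone_watermark_ok_safe takes more inputs (order, classzone, alloc flags)
than this model does. The only point it shows is that counting
MIGRATE_ISOLATE free pages as allocatable can make the check pass while
almost nothing is actually usable.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct zone (illustration only). */
struct toy_zone {
	long nr_free;          /* all pages sitting on the buddy free lists */
	long nr_free_isolated; /* subset of nr_free that is MIGRATE_ISOLATE */
	long watermark;        /* number of pages that must remain free */
};

/* Behaviour described in the thread: isolated pages are counted as free. */
static bool watermark_ok_naive(const struct toy_zone *z)
{
	return z->nr_free > z->watermark;
}

/* Assumed fix direction: only pages that can really be allocated count. */
static bool watermark_ok_isolate_aware(const struct toy_zone *z)
{
	long usable = z->nr_free - z->nr_free_isolated;

	if (usable < 0)
		usable = 0; /* guard against inconsistent counters */
	return usable > z->watermark;
}
```

With nr_free = 1000, nr_free_isolated = 950 and watermark = 128, the
naive check passes (so kswapd would go back to sleep), while the
isolate-aware check fails because only 50 pages are usable.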