linux-kernel - Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after direct reclaim allocation fails

Message-ID: <20100908021341.GA6182@localhost>
Date:	Wed, 8 Sep 2010 10:13:41 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Dave Chinner <david@...morbit.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mel@....ul.ie>,
	Linux Kernel List <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Minchan Kim <minchan.kim@...il.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after
 direct reclaim allocation fails

On Tue, Sep 07, 2010 at 10:23:48PM +0800, Christoph Lameter wrote:
> On Mon, 6 Sep 2010, Dave Chinner wrote:
> 
> > [  596.628086]  [<ffffffff81108a8c>] ? drain_all_pages+0x1c/0x20
> > [  596.628086]  [<ffffffff81108fad>] ? __alloc_pages_nodemask+0x42d/0x700
> > [  596.628086]  [<ffffffff8113d0f2>] ? kmem_getpages+0x62/0x160
> > [  596.628086]  [<ffffffff8113dce6>] ? fallback_alloc+0x196/0x240
> 
> fallback_alloc() showing up here means that one page allocator call from
> SLAB has already failed.

That may be due to the GFP_THISNODE flag, which includes __GFP_NORETRY.
Such an allocation may fail simply because there are many concurrent
page-allocating tasks, not necessarily because memory is really short.
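
To make the flag relationship concrete, here is a minimal user-space
sketch. The SK_* names and bit values are made up for illustration; the
composition of GFP_THISNODE (THISNODE | NOWARN | NORETRY) is what I
believe include/linux/gfp.h has on NUMA kernels of this era, so treat it
as an assumption. The helper models only the __GFP_NORETRY short-circuit
and nothing else.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real ones live in include/linux/gfp.h. */
#define SK_GFP_NORETRY   0x1u
#define SK_GFP_NOWARN    0x2u
#define SK_GFP_THISNODE  0x4u

/* Assumed composition of GFP_THISNODE on NUMA builds. */
#define SK_GFP_THISNODE_MASK \
	(SK_GFP_THISNODE | SK_GFP_NOWARN | SK_GFP_NORETRY)

/* The point of the sketch: the composite mask carries the NORETRY bit. */
static_assert((SK_GFP_THISNODE_MASK & SK_GFP_NORETRY) != 0,
	      "GFP_THISNODE is assumed to imply NORETRY");

/*
 * Simplified retry decision. Only the "do not loop if specifically
 * requested" check is modelled; order and reclaim progress are ignored.
 */
static bool sk_should_retry(unsigned int gfp_mask)
{
	if (gfp_mask & SK_GFP_NORETRY)
		return false;
	return true;
}
```

So any caller passing the composite mask gets exactly one attempt at the
slow path, regardless of how much memory could still be reclaimed.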

The concurrent page-allocating tasks may consume all the pages freed
by try_to_free_pages() inside __alloc_pages_direct_reclaim() before
the direct reclaim task is able to get its page with
get_page_from_freelist(). Then should_alloc_retry() returns 0 for
__GFP_NORETRY, which stops further retries.
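
A rough single-threaded model of that race, as a sketch only. All sk_*
names are hypothetical, the "stealing" is simulated by a per-round count
rather than real concurrency, and the functions merely stand in for the
kernel paths named in the comments.

```c
#include <assert.h>
#include <stdbool.h>

struct sk_zone {
	long free_pages;	/* pages currently on the free lists */
};

/* Take one page; fails when the free lists are empty. */
static bool sk_get_page(struct sk_zone *z)
{
	if (z->free_pages <= 0)
		return false;
	z->free_pages--;
	return true;
}

/*
 * One direct reclaim attempt: free `reclaimed` pages, let other tasks
 * grab `stolen` of them, then try once to allocate for ourselves.
 */
static bool sk_direct_reclaim_alloc(struct sk_zone *z, long reclaimed,
				    long stolen)
{
	assert(reclaimed >= 0 && stolen >= 0);
	z->free_pages += reclaimed;		/* try_to_free_pages() */
	for (long i = 0; i < stolen; i++)	/* concurrent allocators */
		(void)sk_get_page(z);
	return sk_get_page(z);			/* get_page_from_freelist() */
}

/*
 * Slow path: stolen_per_round[i] is how many pages competitors take in
 * round i. With noretry set, the first failed round is final.
 */
static bool sk_alloc_slowpath(struct sk_zone *z, bool noretry,
			      long reclaimed, const long *stolen_per_round,
			      int rounds)
{
	for (int i = 0; i < rounds; i++) {
		if (sk_direct_reclaim_alloc(z, reclaimed, stolen_per_round[i]))
			return true;
		if (noretry)			/* should_alloc_retry() == 0 */
			return false;
	}
	return false;
}
```

With an empty zone, 32 pages reclaimed per round and competitors taking
all 32 in the first round only, the noretry caller fails while a caller
allowed to retry succeeds in the second round.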

In theory, a __GFP_NORETRY allocation might fail even without other tasks
concurrently stealing the current task's direct-reclaimed pages. The pcp
lists might happen to be sparsely populated (pcp.count ranges from 0 to
pcp.batch), and try_to_free_pages() might not free enough pages to fill
them to the pcp.high watermark, hence no pages are freed into the buddy
system and NR_FREE_PAGES is not increased. Then zone_watermark_ok() will
remain false and the allocation fails. Mel's patch to increase the accuracy
of zone_watermark_ok() should help in this case.
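
The pcp effect can be sketched the same way. This is a simplified
single-CPU model with hypothetical sk_* names: freed pages go to the pcp
list first, a batch is moved to the buddy system only once pcp.count
reaches pcp.high, and the watermark check sees only the buddy counter.
The numbers used with it (high=186, batch=31) are just example values.

```c
#include <assert.h>
#include <stdbool.h>

struct sk_pcp {
	int count;	/* pages currently on the per-cpu list */
	int high;	/* drain threshold */
	int batch;	/* pages moved to the buddy system per drain */
};

struct sk_pcp_zone {
	long nr_free_pages;	/* buddy pages only, like NR_FREE_PAGES */
	long min_wmark;		/* simplified single watermark */
	struct sk_pcp pcp;
};

/* Free one order-0 page: it lands on the pcp list first. */
static void sk_free_hot_page(struct sk_pcp_zone *z)
{
	assert(z->pcp.batch > 0 && z->pcp.high >= z->pcp.batch);
	z->pcp.count++;
	if (z->pcp.count >= z->pcp.high) {
		/* Only now do pages reach the buddy free counter. */
		z->pcp.count -= z->pcp.batch;
		z->nr_free_pages += z->pcp.batch;
	}
}

/* Simplified watermark check: pcp pages are invisible to it. */
static bool sk_watermark_ok(const struct sk_pcp_zone *z)
{
	return z->nr_free_pages > z->min_wmark;
}

/* Reclaim `n` pages, one free at a time. */
static void sk_reclaim(struct sk_pcp_zone *z, int n)
{
	for (int i = 0; i < n; i++)
		sk_free_hot_page(z);
}
```

Starting at the watermark with pcp.count=5, reclaiming 32 pages leaves
the buddy counter untouched and the check still fails, although 37 pages
are sitting on the pcp list. Draining the pcp lists, or a watermark check
that accounts for them, would make those pages visible.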

> SLAB then did an expensive search through all
> object caches on all nodes to find some available object. There were no
> objects in queues at all therefore SLAB called the page allocator again
> (kmem_getpages()).
> 
> As soon as memory is available (on any node or any cpu, they are all
> empty) SLAB will repopulate its queues(!).

Thanks,
Fengguang