Message-ID: <84144f020906241000l5870771fp262444cbc1840653@mail.gmail.com>
Date: Wed, 24 Jun 2009 20:00:04 +0300
From: Pekka Enberg <penberg@...helsinki.fi>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Arjan van de Ven <arjan@...radead.org>,
linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
Christoph Lameter <cl@...ux-foundation.org>,
Nick Piggin <npiggin@...e.de>
Subject: Re: upcoming kerneloops.org item: get_page_from_freelist
On Wed, Jun 24, 2009 at 7:56 PM, Pekka Enberg<penberg@...helsinki.fi> wrote:
> On Wed, Jun 24, 2009 at 7:55 PM, Pekka Enberg<penberg@...helsinki.fi> wrote:
>> Hi Andrew,
>>
>> On Wed, 24 Jun 2009 08:07:53 -0700 Arjan van de Ven <arjan@...radead.org> wrote:
>>>> a new item is coming up fast in the kerneloops.org stats, and it's new
>>>> in 2.6.31-rc;
>>>>
>>>> http://www.kerneloops.org/searchweek.php?search=get_page_from_freelist
>>>>
>>>> it's this warning in mm/page_alloc.c:
>>>>
>>>> * __GFP_NOFAIL is not to be used in new code.
>>>> *
>>>> * All __GFP_NOFAIL callers should be fixed so that they
>>>> * properly detect and handle allocation failures.
>>>> *
>>>> * We most definitely don't want callers attempting to
>>>> * allocate greater than single-page units with
>>>> * __GFP_NOFAIL.
>>>> */
>>>> WARN_ON_ONCE(order > 0);
>>>>
>>>>
>>>> typical backtraces look like
>>>>
>>>> get_page_from_freelist
>>>> __alloc_pages_nodemask
>>>> alloc_pages_current
>>>> alloc_slab_page
>>>> new_slab
>>>> __slab_alloc
>>>> kmem_cache_alloc_notrace
>>>> start_this_handle
>>>> jbd2_journal_start
>>>>
>>>> and
>>>>
>>>> get_page_from_freelist
>>>> __alloc_pages_nodemask
>>>> alloc_pages_current
>>>> alloc_slab_page
>>>> new_slab
>>>> __slab_alloc
>>>> kmem_cache_alloc_notrace
>>>> start_this_handle
>>>> journal_start
>>>> ext3_journal_start_sb
>>>> ext3_journal_start
>>>> ext3_dirty_inode
>>>>
>>>> but there are some other ones as well at the url above.
>>>>
>>>>
>>>> git blame shows that
>>>>
>>>> commit dab48dab37d2770824420d1e01730a107fade1aa
>>>> Author: Andrew Morton <akpm@...ux-foundation.org>
>>>> Date: Tue Jun 16 15:32:37 2009 -0700
>>>>
>>>> introduced this WARN_ON.....
>>
>> On Wed, Jun 24, 2009 at 7:46 PM, Andrew Morton<akpm@...ux-foundation.org> wrote:
>>> Well yes. Using GFP_NOFAIL on a higher-order allocation is bad. This
>>> patch is there to find, name, shame, blame and hopefully fix callers.
>>>
>>> A fix for cxgb3 is in the works. slub's design is a big problem.
>>>
>>> But we'll probably have to revert it for 2.6.31 :(
>>
>> How is SLUB's design a problem here? Can't we just clear GFP_NOFAIL
>> from the higher order allocation and thus force GFP_NOFAIL allocations
>> to use the minimum required order?
>
> Small correction: force GFP_NOFAIL allocations to use minimum order
> _if_ the higher order allocation fails.

And here's a badly linewrapped, untested patch to do that (sorry I
don't have my laptop here). Christoph, does this look ok to you?

diff --git a/mm/slub.c b/mm/slub.c
index ce62b77..8aaf0fa 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1088,8 +1088,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 
 	flags |= s->allocflags;
 
-	page = alloc_slab_page(flags | __GFP_NOWARN | __GFP_NORETRY, node,
-									oo);
+	page = alloc_slab_page((flags & ~__GFP_NOFAIL) | __GFP_NOWARN | __GFP_NORETRY, node, oo);
 	if (unlikely(!page)) {
 		oo = s->min;
 		/*
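
For anyone who wants to see the intended behaviour without building a
kernel, here is a minimal userspace sketch in C. It assumes that the
fallback path, which is cut off in the hunk above, retries at the minimum
order with the caller's original flags. The sk_ names and the flag values
are made up for illustration and are not the kernel's definitions;
sk_alloc_pages() is only a stand-in that fails every request above a
configurable order.

```c
#include <stddef.h>

/* Hypothetical flag bits, for illustration only (not the kernel's values). */
#define SK_GFP_NOFAIL	0x1u
#define SK_GFP_NOWARN	0x2u
#define SK_GFP_NORETRY	0x4u

typedef unsigned int sk_gfp_t;

struct sk_page {
	unsigned int order;
	sk_gfp_t flags;
};

/* Largest order the stand-in allocator will satisfy. */
static unsigned int sk_max_order;
/* Flags of the most recent request that the stand-in allocator refused. */
static sk_gfp_t sk_last_failed_flags;
/* Single backing object; enough for a sketch. */
static struct sk_page sk_pool;

/* Stand-in for the page allocator: fails anything above sk_max_order. */
static struct sk_page *sk_alloc_pages(sk_gfp_t flags, unsigned int order)
{
	if (order > sk_max_order) {
		sk_last_failed_flags = flags;
		return NULL;
	}
	sk_pool.order = order;
	sk_pool.flags = flags;
	return &sk_pool;
}

/*
 * Try the preferred (higher) order opportunistically, then fall back to
 * the minimum order. Only the fallback may carry the NOFAIL bit.
 */
static struct sk_page *sk_allocate_slab(sk_gfp_t flags,
					unsigned int preferred_order,
					unsigned int min_order)
{
	struct sk_page *page;

	/* The first attempt must be allowed to fail, so strip NOFAIL. */
	page = sk_alloc_pages((flags & ~SK_GFP_NOFAIL) |
			      SK_GFP_NOWARN | SK_GFP_NORETRY,
			      preferred_order);
	if (!page) {
		/* Assumed fallback: minimum order, caller's flags untouched. */
		page = sk_alloc_pages(flags, min_order);
	}
	return page;
}
```

With sk_max_order left at 0, calling sk_allocate_slab(SK_GFP_NOFAIL, 3, 0)
should make the order-3 attempt fail without the NOFAIL bit set and then
succeed at order 0 with the NOFAIL bit still set, which is the case the
WARN_ON_ONCE(order > 0) check is meant to leave alone.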