Message-Id: <1179776038.5735.39.camel@lappy>
Date: Mon, 21 May 2007 21:33:58 +0200
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
To: Christoph Lameter <clameter@....com>
Cc: Matt Mackall <mpm@...enic.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Thomas Graf <tgraf@...g.ch>,
David Miller <davem@...emloft.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Daniel Phillips <phillips@...gle.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Paul Jackson <pj@....com>, npiggin@...e.de
Subject: Re: [PATCH 0/5] make slab gfp fair
On Mon, 2007-05-21 at 09:45 -0700, Christoph Lameter wrote:
> On Sun, 20 May 2007, Peter Zijlstra wrote:
>
> > I care about kernel allocations only. In particular about those that
> > have PF_MEMALLOC semantics.
>
> Hmmmm.. I wish I was more familiar with PF_MEMALLOC. ccing Nick.
>
> > - set page->reserve nonzero for each page allocated with
> > ALLOC_NO_WATERMARKS; which by the previous point implies that all
> > available zones are below ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER
>
> Ok that adds a new field to the page struct. I suggested a page flag in
> slub before.
No, it doesn't; it overloads page->index. It's just used as an extra return
value; it need not be persistent. Definitely not worth a page flag.
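To illustrate what I mean (just a sketch; the exact hook point in the page
allocator slow path and the use of page->index as the carrier are
illustrative, not the actual patch):

/*
 * Sketch: right after the watermark-ignoring attempt in the page
 * allocator slow path.  page->index is not used for anything at this
 * point, so it can carry the "came from the reserves" bit back to the
 * slab allocator; nothing relies on it staying set afterwards.
 */
page = get_page_from_freelist(gfp_mask, order, zonelist,
                              ALLOC_NO_WATERMARKS);
if (page)
        page->index = 1;        /* page->reserve */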
> > - when a page->reserve slab is allocated store it in s->reserve_slab
> > and do not update the ->cpu_slab[] (this forces subsequent allocs to
> > retry the allocation).
>
> Right that should work.
>
> > All ALLOC_NO_WATERMARKS enabled slab allocations are served from
> > ->reserve_slab, up until the point where a !page->reserve slab alloc
> > succeeds, at which point the ->reserve_slab is pushed into the partial
> > lists and ->reserve_slab set to NULL.
>
> So the original issue is still not fixed. A slab alloc may succeed without
> watermarks if that particular allocation is restricted to a different set
> of nodes. Then the reserve slab is dropped despite the memory scarcity on
> another set of nodes?
I can't see how. This extra ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER alloc will
first deplete all other zones. Once that starts failing, no node should
still have pages accessible to any allocation context other than
PF_MEMALLOC.
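To make the intended flow concrete, the slab slow path would look roughly
like this (an illustrative sketch, not the actual patch;
alloc_object_from() is a stand-in helper name):

/*
 * Sketch of __slab_alloc() growing a new slab: a slab page that was
 * only obtainable from the reserves is parked in s->reserve_slab
 * instead of becoming the cpu slab, so ordinary allocations keep
 * retrying the page allocator instead of eating into the reserves.
 */
page = new_slab(s, gfpflags, node);
if (!page)
        return NULL;

if (page->reserve) {
        /*
         * Memory is tight: only hand objects out to ALLOC_NO_WATERMARKS
         * style allocations, and do not install this as the cpu slab.
         */
        s->reserve_slab = page;
        return alloc_object_from(page);
}

/*
 * A !page->reserve slab succeeded: the pressure is over, so push the
 * reserve slab onto the partial lists and resume normal operation.
 */
if (s->reserve_slab) {
        add_partial(s, s->reserve_slab);
        s->reserve_slab = NULL;
}
/* ... install page as the cpu slab and continue as usual ... */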
> > Since only the allocation of a new slab uses the gfp zone flags (the
> > other allocations' flags are just placement hints), they have to be
> > uniform over all slab allocs for a given kmem_cache. Thus the
> > s->reserve_slab/page->reserve status is kmem_cache wide.
>
> No, the gfp zone flags are not uniform, and the placement of page allocator
> allocs through SLUB does not always have the same allocation constraints.
It has to be; since an allocation can be served from a pre-existing slab
page, any page allocation must be valid for all other users of that
kmem_cache.
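Perhaps it is clearer in code form (a sketch; both helper names are made
up for illustration):

void *kmalloc_sketch(struct kmem_cache *s, gfp_t flags)
{
        void *object;

        /*
         * Fast path: hand out an object from an already existing slab
         * page.  That page may well have been allocated earlier under a
         * different caller's gfp zone flags and placement hints.
         */
        object = alloc_from_existing_slab(s);
        if (object)
                return object;

        /*
         * Only here do this caller's gfp zone flags actually reach the
         * page allocator, when a brand new slab page is needed.  Hence
         * they must be acceptable to every user of this kmem_cache.
         */
        return alloc_from_new_slab(s, flags);
}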
> SLUB will check the node of the page that was allocated when the page
> allocator returns and put the page into that nodes slab list. This varies
> depending on the allocation context.
Yes, it keeps slabs on per-node lists. I'm just not seeing how this puts
hard constraints on the allocations.
As far as I can see there cannot be a hard constraint here, because
allocations from interrupt context are at best node-local. And node-affine
zonelists still have all zones, just ordered on locality.
> Allocations can be particular to uses of a slab in particular situations.
> A kmalloc cache can be used to allocate from various sets of nodes in
> different circumstances. kmalloc will allow serving a limited number of
> objects from the wrong nodes for performance reasons but the next
> allocation from the page allocator (or from the partial lists) will occur
> using the current set of allowed nodes in order to ensure a rough
> obedience to the memory policies and cpusets. kmalloc_node behaves
> differently and will enforce using memory from a particular node.
From what I can see, it takes pretty much any page it can get once you
hit it with PF_MEMALLOC. If the page allocation doesn't use ALLOC_CPUSET,
the page can come from pretty much anywhere.
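That is, the relevant bit of the __alloc_pages() slow path looks roughly
like this (paraphrased from memory, not a verbatim quote of the source):

/*
 * Once the normal watermark-obeying attempts have failed and the task
 * has PF_MEMALLOC set (and we are not in interrupt context), the
 * zonelist is walked again with ALLOC_NO_WATERMARKS -- no ALLOC_CPUSET,
 * no watermarks -- so the page can come from any zone with a free page.
 * (p == current here.)
 */
if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
                && !in_interrupt()) {
        if (!(gfp_mask & __GFP_NOMEMALLOC)) {
                page = get_page_from_freelist(gfp_mask, order, zonelist,
                                              ALLOC_NO_WATERMARKS);
                if (page)
                        goto got_pg;
        }
}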