Message-ID: <alpine.DEB.2.00.1104211553390.9496@router.home>
Date: Thu, 21 Apr 2011 16:07:43 -0500 (CDT)
From: Christoph Lameter <cl@...ux.com>
To: James Bottomley <James.Bottomley@...senPartnership.com>
cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
David Rientjes <rientjes@...gle.com>,
Pekka Enberg <penberg@...nel.org>,
Michal Hocko <mhocko@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>,
linux-parisc@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
x86 maintainers <x86@...nel.org>, Tejun Heo <tj@...nel.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
Mel Gorman <mel@....ul.ie>
Subject: Re: [PATCH v3] mm: make expand_downwards symmetrical to
expand_upwards
On Thu, 21 Apr 2011, James Bottomley wrote:
> > Dave Hansen, Mel: Can you provide us with some help? (It's Easter and so
> > the Europeans may be off for a while)
>
> It sort of depends on your definition of easy. The problem going from
> DISCONTIGMEM to SPARSEMEM is sorting out the section size (the minimum
> indivisible size for a sectional_mem_map array) and also deciding on
> whether you need SPARSEMEM_EXTREME (discontigmem allows arbitrarily
> different sizes for each contiguous region) or
> ARCH_HAS_HOLES_MEMORYMODEL (allows empty mem_map regions as well). I
> suspect most architectures will want SPARSEMEM_EXTREME (it means that
> the section array isn't fully populated) because the gaps can be huge
> (we've got a 64GB gap on parisc).
Well, my favorite is SPARSEMEM_VMEMMAP, because it allows page-level holes
and uses the TLB (via page tables) to avoid lookups in the SPARSE maps, but
that is likely not going to be in an initial fix.
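To illustrate the difference described above, here is a minimal userspace
sketch in C. It is a toy model, not kernel code: the section size, the
section_mem_map and vmemmap_base arrays, and the helpers toy_init(),
sparse_pfn_to_page() and vmemmap_pfn_to_page() are all assumptions made up
for this example. The classic sparse lookup has to consult a per-section
table first, while the vmemmap-style lookup is plain pointer arithmetic on
one virtually contiguous array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy values, chosen only to keep the example small. */
#define SECTION_SHIFT      4                      /* 16 pages per section */
#define PAGES_PER_SECTION  (1UL << SECTION_SHIFT)
#define NR_SECTIONS        8UL

struct page { unsigned long flags; };

/* Classic sparse layout: one mem_map chunk per present section. */
struct page *section_mem_map[NR_SECTIONS];

/* vmemmap-style layout: a single virtually contiguous array of struct page. */
struct page *vmemmap_base;

/* Sparse lookup: table access first, then an offset within the section. */
struct page *sparse_pfn_to_page(unsigned long pfn)
{
    unsigned long nr = pfn >> SECTION_SHIFT;

    assert(nr < NR_SECTIONS);
    struct page *map = section_mem_map[nr];
    if (map == NULL)
        return NULL;                              /* section is a hole */
    return map + (pfn & (PAGES_PER_SECTION - 1));
}

/* vmemmap lookup: pure arithmetic, no table access in software. */
struct page *vmemmap_pfn_to_page(unsigned long pfn)
{
    assert(pfn < NR_SECTIONS * PAGES_PER_SECTION);
    return vmemmap_base + pfn;
}

/*
 * Populate sections 0 and 5 only. In userspace the whole vmemmap array is
 * allocated; in a kernel the holes would simply not be backed by page
 * tables, which this toy cannot model.
 */
int toy_init(void)
{
    section_mem_map[0] = calloc(PAGES_PER_SECTION, sizeof(struct page));
    section_mem_map[5] = calloc(PAGES_PER_SECTION, sizeof(struct page));
    vmemmap_base = calloc(NR_SECTIONS * PAGES_PER_SECTION, sizeof(struct page));
    return (section_mem_map[0] && section_mem_map[5] && vmemmap_base) ? 0 : -1;
}
```

After toy_init(), a pfn in section 5 resolves through the table in the sparse
variant and through a single addition in the vmemmap variant, while a pfn in
an unpopulated section returns NULL from the sparse lookup.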
> However, even though I think we can do this going forwards ... I don't
> think we can backport it as a bug fix for the slub panic.
So far there seems to be no other solution that will fix the issues
cleanly, since the core and discontig have clashing notions of a node in
the !NUMA case. That is a pretty basic thing to get wrong.

If we can avoid all the fancy stuff and Dave can just get a minimal SPARSE
config going, then this may be the best solution for stable as well.

But then these configs have been broken for years and no one noticed. This
means the users of these arches have likely been running a subset of
kernel functionality. I suspect they have never freed memory from
DISCONTIG node 1 and higher with CONFIG_DEBUG_VM on. Otherwise I
cannot explain why the VM_BUG_ONs in mm/page_alloc.c:move_freepages() did
not trigger. That should have been brought to the MM developers'
attention.

This set of circumstances leads to the suspicion that the only tests run
were ones that showed that the kernel booted. Higher node memory was never
touched and the MM code was never truly exercised.

So I am not sure that there is any urgency in this matter. No one has
cared for years, after all.
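
The point about the VM_BUG_ONs depends on those checks being compiled in only
when CONFIG_DEBUG_VM is set. The following is a rough userspace sketch of that
behaviour, under loud assumptions: struct toy_page, toy_move_freepages() and
the toy_bug_hits counter are hypothetical, the macro here counts violations
instead of halting, and the check is only modelled on the idea of a
zone-consistency test, not copied from mm/page_alloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often a debug check fired (a real BUG would halt instead). */
int toy_bug_hits;

/* Toy stand-in for the kernel macro: active only with CONFIG_DEBUG_VM. */
#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond) do { if (cond) toy_bug_hits++; } while (0)
#else
/* Without CONFIG_DEBUG_VM the condition is type-checked but never evaluated. */
#define VM_BUG_ON(cond) do { (void)sizeof(cond); } while (0)
#endif

/* Hypothetical page descriptor carrying only the zone it belongs to. */
struct toy_page { int zone_id; };

/*
 * Hypothetical model of moving a range of free pages. The range is expected
 * to lie within one zone; a violation is caught only in debug builds.
 * Returns the number of pages in the inclusive range [start, end].
 */
size_t toy_move_freepages(struct toy_page *start, struct toy_page *end)
{
    assert(start <= end);
    VM_BUG_ON(start->zone_id != end->zone_id);
    return (size_t)(end - start) + 1;
}
```

Built without -DCONFIG_DEBUG_VM, a range that straddles two zones passes
through silently and toy_bug_hits stays at zero; built with it, the same call
increments the counter. That is the sense in which a broken configuration can
go unnoticed for years on non-debug kernels.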