Message-ID: <51C9E422.6040304@sgi.com>
Date: Tue, 25 Jun 2013 11:40:34 -0700
From: Mike Travis <travis@....com>
To: "H. Peter Anvin" <hpa@...or.com>
CC: Yinghai Lu <yinghai@...nel.org>,
Greg KH <gregkh@...uxfoundation.org>,
Nathan Zimmer <nzimmer@....com>, Robin Holt <holt@....com>,
Rob Landley <rob@...dley.net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
the arch/x86 maintainers <x86@...nel.org>,
linux-doc@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 0/2] Delay initializing of large sections of memory
On 6/25/2013 11:17 AM, H. Peter Anvin wrote:
> On 06/25/2013 10:35 AM, Mike Travis wrote:
>>
>> The two params that I couldn't figure out how to provide, except via a
>> kernel param option, were the memory block size (128M or 2G) and the
>> physical address space per node. The other three params can be set up
>> automatically by a script once the total system size is known. As soon
>> as we verify on the 32TB system and estimate what will be needed for
>> 64TB, those three params can probably disappear.
>>
>
> "Setup by script" is a no-go. You *have* the total system size already;
> it is in the e820 tables (anything which isn't in e820 is hotplug, and
> that automagically gets deferred.)
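(Aside for readers following along: the total memory hpa refers to is the sum of the usable ranges in the BIOS-provided e820 map. A toy sketch of that summation in Python, over a made-up sample map, purely for illustration:)

```python
# Sketch: summing the "usable" ranges of an e820-style map to get the
# total system memory. The sample map below is invented for illustration;
# real entries appear as "BIOS-e820" lines in the kernel log.

def total_usable(e820):
    """Sum the sizes of all 'usable' ranges (inclusive end addresses)."""
    return sum(end - start + 1 for start, end, kind in e820 if kind == "usable")

sample_e820 = [
    (0x0000000000000000, 0x000000000009fbff, "usable"),
    (0x000000000009fc00, 0x000000000009ffff, "reserved"),
    (0x0000000000100000, 0x00000000bfffffff, "usable"),
    (0x0000000100000000, 0x000000023fffffff, "usable"),
]

print(total_usable(sample_e820))  # total usable bytes, just under 8 GiB here
```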
Okay, I'll figure something out. If Yinghai's SRAT patch can help
with the node address space, then I might be able to determine if
the system is a UV, which is the only system I see that uses 2G
memory blocks. (Or make get_memory_block_size() a global.)
Then a simple param to start the insertion early or defer it until
the system is fully up is still useful, and that's easy to understand.
[I think we still want to keep the actual process of moving memory
to the absent list an option, yes? If for no other reason than
to rule out this code when a problem crops up. Or at least have a
way to disable the process if it's CONFIG'd in.]
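(Side note on the block-size question above: from userspace, the memory block size that hotplug uses is exposed as a hex string, with no "0x" prefix, in /sys/devices/system/memory/block_size_bytes. A small sketch of reading it, assuming that sysfs layout:)

```python
# Sketch: parsing the memory block size from sysfs. The file holds a
# bare hex string, e.g. "8000000" for 128M on typical x86 boxes or
# "80000000" for the 2G blocks mentioned for UV systems.

def parse_block_size(text):
    return int(text.strip(), 16)

def read_block_size(path="/sys/devices/system/memory/block_size_bytes"):
    with open(path) as f:
        return parse_block_size(f.read())

# The two sizes discussed in this thread, as they would appear in sysfs:
print(parse_block_size("8000000") == 128 << 20)   # 128M
print(parse_block_size("80000000") == 2 << 30)    # 2G
```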
>
> However, please consider Ingo's counterproposal of doing this via the
> buddy allocator, i.e. hugepages being broken on demand. That is a
> *very* powerful model, although would require more infrastructure.
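(For readers unfamiliar with the counterproposal: the idea is that deferred memory sits on the free lists as maximal blocks, and the buddy allocator splits a block in half, repeatedly, only when a smaller allocation demands it. A toy model of that split-on-demand step, not the kernel's actual implementation:)

```python
# Toy model of buddy-style split-on-demand: free lists are keyed by
# order (block size = 2**order pages). Allocating a small order splits
# a larger free block in half repeatedly until a block of the right
# size exists. Illustration only, not kernel code.

def alloc(free_lists, order, max_order):
    """Take a block of 2**order pages, splitting larger blocks on demand."""
    o = order
    while o <= max_order and not free_lists.get(o):
        o += 1                      # find the smallest available larger block
    if o > max_order:
        return None                 # nothing large enough is free
    addr = free_lists[o].pop()
    while o > order:                # halve until the requested size is reached
        o -= 1
        buddy = addr + (1 << o)     # upper half goes back on the free list
        free_lists.setdefault(o, []).append(buddy)
    return addr

# One maximal block of 2**4 = 16 pages at page address 0; allocate 1 page.
free_lists = {4: [0]}
page = alloc(free_lists, 0, 4)
print(page, free_lists)  # 0 {4: [], 3: [8], 2: [4], 1: [2], 0: [1]}
```

The appeal of the model is visible even in the toy: nothing below the maximal order is touched until someone actually asks for it.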
We will certainly continue to make improvements as larger system sizes
become more commonplace (and customers continue to complain :). But
we are cutting it close to get this into the nextgen distro
releases, so that would have to be a follow-on project. (I've been
working on this patch since last November.)
Thanks,
Mike
>
> -hpa
>
>