Message-Id: <20130322.161121.07584638.d.hatayama@jp.fujitsu.com>
Date: Fri, 22 Mar 2013 16:11:21 +0900 (JST)
From: HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
To: vgoyal@...hat.com
Cc: ebiederm@...ssion.com, cpw@....com,
kumagai-atsushi@....nes.nec.co.jp, lisa.mitchell@...com,
heiko.carstens@...ibm.com, akpm@...ux-foundation.org,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
zhangyanfei@...fujitsu.com
Subject: Re: [PATCH v3 18/21] vmcore: check if vmcore objects satify
mmap()'s page-size boundary requirement
From: Vivek Goyal <vgoyal@...hat.com>
Subject: Re: [PATCH v3 18/21] vmcore: check if vmcore objects satify mmap()'s page-size boundary requirement
Date: Thu, 21 Mar 2013 10:49:29 -0400
> On Thu, Mar 21, 2013 at 12:22:59AM -0700, Eric W. Biederman wrote:
>> HATAYAMA Daisuke <d.hatayama@...fujitsu.com> writes:
>>
>> > OK, rigorously, success or failure of the requested free pages
>> > allocation depends on the actual memory layout at the 2nd kernel boot. To
>> > increase the chance of allocating memory, we have no method but
>> > to reserve more memory for the 2nd kernel now.
>>
>> Good enough. If there are fragmentation issues that cause allocation
>> problems on larger boxes we can use vmalloc and remap_vmalloc_range, but
>> we certainly don't need to start there.
>>
>> Especially as for most 8 or 16 core boxes we are talking about a 4KiB or
>> an 8KiB allocation. Aka order 0 or order 1.
>>
>
> Actually we are already handling the large SGI machines so we need
> to plan for 4096 cpus now while we write these patches.
>
> vmalloc() and remap_vmalloc_range() sounds reasonable. So that's what
> we should probably use.
>
> Alternatively, why not allocate everything in 4K pages, use vmcore_list
> to map offsets into the right addresses, and call remap_pfn_range() on
> these addresses?

I have an introductory question about the design of vmalloc. My
understanding is that vmalloc allocates enough *pages* to cover the
requested size and returns the virtual address of the first one.
So, the returned address is inherently always page-size aligned.

It looks like vmalloc does so in the current implementation, but I
don't know about older implementations, and I cannot confirm that this
is guaranteed by vmalloc's interface. The comment explaining the
interface of vmalloc is below, but it seems a little vague to me in
that it doesn't clearly say what is returned as the address.

/**
 *	vmalloc  -  allocate virtually contiguous memory
 *	@size:		allocation size
 *	Allocate enough pages to cover @size from the page level
 *	allocator and map them into contiguous kernel virtual space.
 *
 *	For tight control over page level allocator and protection flags
 *	use __vmalloc() instead.
 */
void *vmalloc(unsigned long size)
{
	return __vmalloc_node_flags(size, NUMA_NO_NODE,
				    GFP_KERNEL | __GFP_HIGHMEM);
}
EXPORT_SYMBOL(vmalloc);
BTW, a simple test module also shows that vmalloc returns page-size
aligned objects. Here a 1-byte object is allocated 12 times.

$ dmesg | tail -n 12
[3552817.290982] test: objects[0] = ffffc9000060c000
[3552817.291197] test: objects[1] = ffffc9000060e000
[3552817.291379] test: objects[2] = ffffc9000067d000
[3552817.291566] test: objects[3] = ffffc90010f99000
[3552817.291833] test: objects[4] = ffffc90010f9b000
[3552817.292015] test: objects[5] = ffffc90010f9d000
[3552817.292207] test: objects[6] = ffffc90010f9f000
[3552817.292386] test: objects[7] = ffffc90010fa1000
[3552817.292574] test: objects[8] = ffffc90010fa3000
[3552817.292785] test: objects[9] = ffffc90010fa5000
[3552817.292964] test: objects[10] = ffffc90010fa7000
[3552817.293143] test: objects[11] = ffffc90010fa9000
Thanks.
HATAYAMA, Daisuke