Message-Id: <20130322.161121.07584638.d.hatayama@jp.fujitsu.com>
Date:	Fri, 22 Mar 2013 16:11:21 +0900 (JST)
From:	HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
To:	vgoyal@...hat.com
Cc:	ebiederm@...ssion.com, cpw@....com,
	kumagai-atsushi@....nes.nec.co.jp, lisa.mitchell@...com,
	heiko.carstens@...ibm.com, akpm@...ux-foundation.org,
	kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
	zhangyanfei@...fujitsu.com
Subject: Re: [PATCH v3 18/21] vmcore: check if vmcore objects satify
 mmap()'s page-size boundary requirement

From: Vivek Goyal <vgoyal@...hat.com>
Subject: Re: [PATCH v3 18/21] vmcore: check if vmcore objects satify mmap()'s page-size boundary requirement
Date: Thu, 21 Mar 2013 10:49:29 -0400

> On Thu, Mar 21, 2013 at 12:22:59AM -0700, Eric W. Biederman wrote:
>> HATAYAMA Daisuke <d.hatayama@...fujitsu.com> writes:
>> 
>> > OK, rigorously, success or failure of the requested free pages
>> > allocation depends on the actual memory layout at the 2nd kernel boot. To
>> > increase the possibility of allocating memory, we have no method but
>> > to reserve more memory for the 2nd kernel now.
>> 
>> Good enough.   If there are fragmentation issues that cause allocation
>> problems on larger boxes we can use vmalloc and remap_vmalloc_range, but
>> we certainly don't need to start there.
>> 
>> Especially as for most 8 or 16 core boxes we are talking about a 4KiB or
>> an 8KiB allocation.  Aka order 0 or order 1.
>> 
> 
> Actually we are already handling the large SGI machines so we need
> to plan for 4096 cpus now while we write these patches.
> 
> vmalloc() and remap_vmalloc_range() sound reasonable. So that's what
> we should probably use.
> 
> Alternatively, why not allocate everything in 4K pages and use vmcore_list
> to map offsets into the right addresses and call remap_pfn_range() on these
> addresses?

I have an introductory question about the design of vmalloc. My
understanding is that vmalloc allocates enough *pages* to cover the
requested size and returns the virtual address of the first one.
So, the returned address is inherently always page-size aligned.

It looks like vmalloc does so in the current implementation, but I
don't know about older implementations, and I cannot confirm that this
is guaranteed by vmalloc's interface. The comment documenting the
interface of vmalloc is quoted below, but it seems a little vague to me
in that it doesn't say clearly what is returned as an address.

/**
 *      vmalloc  -  allocate virtually contiguous memory
 *      @size:          allocation size
 *      Allocate enough pages to cover @size from the page level
 *      allocator and map them into contiguous kernel virtual space.
 *
 *      For tight control over page level allocator and protection flags
 *      use __vmalloc() instead.
 */
void *vmalloc(unsigned long size)
{
        return __vmalloc_node_flags(size, NUMA_NO_NODE,
                                    GFP_KERNEL | __GFP_HIGHMEM);
}
EXPORT_SYMBOL(vmalloc);
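
To make the "enough pages to cover @size" wording concrete, here is a
small user-space sketch of the rounding I have in mind. It is not the
kernel code: the 4 KiB page size and the sketch_* names are assumptions
made only for this example, and it merely mirrors what I understand
PAGE_ALIGN() to do.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 KiB pages (e.g. x86_64); the real PAGE_SIZE is arch-specific. */
#define SKETCH_PAGE_SIZE 4096UL

/* Round size up to the next multiple of the assumed page size. */
static unsigned long sketch_page_align(unsigned long size)
{
        /* Guard against wrap-around in the addition below. */
        assert(size <= ULONG_MAX - (SKETCH_PAGE_SIZE - 1));
        return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Number of whole pages needed to cover size bytes. */
static unsigned long sketch_pages_for(unsigned long size)
{
        return sketch_page_align(size) / SKETCH_PAGE_SIZE;
}
```

Under this assumption, a 1-byte request is covered by one page and a
4097-byte request by two pages.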

BTW, a simple test module also shows that vmalloc returns page-size
aligned objects; here, 1-byte objects are allocated 12 times.

$ dmesg | tail -n 12
[3552817.290982] test: objects[0] = ffffc9000060c000
[3552817.291197] test: objects[1] = ffffc9000060e000
[3552817.291379] test: objects[2] = ffffc9000067d000
[3552817.291566] test: objects[3] = ffffc90010f99000
[3552817.291833] test: objects[4] = ffffc90010f9b000
[3552817.292015] test: objects[5] = ffffc90010f9d000
[3552817.292207] test: objects[6] = ffffc90010f9f000
[3552817.292386] test: objects[7] = ffffc90010fa1000
[3552817.292574] test: objects[8] = ffffc90010fa3000
[3552817.292785] test: objects[9] = ffffc90010fa5000
[3552817.292964] test: objects[10] = ffffc90010fa7000
[3552817.293143] test: objects[11] = ffffc90010fa9000
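
The alignment of the addresses above can be double-checked mechanically.
The following is only a user-space sketch, not the test module itself:
it copies the 12 addresses from the dmesg output and tests the low bits
against an assumed 4 KiB page size. The obj_* and OBJ_* names are made
up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the output above does not state the page size. */
#define OBJ_PAGE_SIZE 4096ULL

/* Addresses copied verbatim from the dmesg output above. */
static const uint64_t obj_addrs[12] = {
        0xffffc9000060c000ULL, 0xffffc9000060e000ULL,
        0xffffc9000067d000ULL, 0xffffc90010f99000ULL,
        0xffffc90010f9b000ULL, 0xffffc90010f9d000ULL,
        0xffffc90010f9f000ULL, 0xffffc90010fa1000ULL,
        0xffffc90010fa3000ULL, 0xffffc90010fa5000ULL,
        0xffffc90010fa7000ULL, 0xffffc90010fa9000ULL,
};

/* An address is page aligned when all bits below the page size are zero. */
static int obj_is_page_aligned(uint64_t addr)
{
        return (addr & (OBJ_PAGE_SIZE - 1)) == 0;
}

/* Count how many of the n addresses are NOT page aligned. */
static size_t obj_count_unaligned(const uint64_t *addrs, size_t n)
{
        size_t i, bad = 0;

        assert(addrs != NULL);
        for (i = 0; i < n; i++)
                if (!obj_is_page_aligned(addrs[i]))
                        bad++;
        return bad;
}
```

Calling obj_count_unaligned(obj_addrs, 12) counts the addresses whose
low 12 bits are non-zero; for the values listed above there are none.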

Thanks.
HATAYAMA, Daisuke
