linux-kernel - Re: [PATCH 5 of 6] hotplug-memory: add section


Date:	Fri, 04 Apr 2008 11:21:18 -0700
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Dave Hansen <dave@...ux.vnet.ibm.com>
CC:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Yasunori Goto <y-goto@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Christoph Lameter <clameter@....com>
Subject: Re: [PATCH 5 of 6] hotplug-memory: add section_ops

Dave Hansen wrote:
> I disagree that it is redundant.  There is actually a lot of power
> there, and it was done for some very specific reasons.  They're actually
> especially important to your particular use, which I'll try to explain.
>
> Say that you have a very fragmented lowmem-only machine and you want to
> add a section of new mem_map[].  That's 512MB/PAGE_SIZE*sizeof(struct
> page), which is between 4 and 8MB.  So, you need 8MB of contiguous pages
> for that mem_map[].  The way we designed it, you could always have that
> extra section sitting there, unused.  You have that overhead of 8MB
> sitting there, but the benefit is that you can *ALWAYS* add memory.
>
> I worry that the current patches as posted don't allow this kind of
> behavior.  If there was ever a need to preallocate the space, it would
> need to do it inside the balloon driver itself and a way found to pass
> that into the hotplug code.
>   

Yes, there's a chicken-egg problem if you want to use the new memory to 
describe itself.  But in the x86-32 case, hotplug memory is always 
highmem, and so can't be used for page structures, right?
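
For reference, the 4 to 8MB figure quoted above can be reproduced with a small calculation. This is only a sketch: the helper names are hypothetical, and the 4KB page size and the 32- and 64-byte values for sizeof(struct page) are assumptions chosen to match the range Dave gives, not values taken from any particular kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes of mem_map[] needed to describe one section.
 * Mirrors the expression in the quoted text:
 *     section size / PAGE_SIZE * sizeof(struct page)
 */
size_t memmap_bytes(size_t section_bytes, size_t page_size,
                    size_t struct_page_bytes)
{
    assert(page_size > 0);
    return section_bytes / page_size * struct_page_bytes;
}

/*
 * Hypothetical helper: number of whole pages that must be found before
 * the section's mem_map[] can be populated (rounded up).
 */
size_t memmap_pages(size_t section_bytes, size_t page_size,
                    size_t struct_page_bytes)
{
    size_t bytes = memmap_bytes(section_bytes, page_size, struct_page_bytes);

    return (bytes + page_size - 1) / page_size;
}
```

Under those assumptions a 512MB section needs 4MB of mem_map[] with a 32-byte struct page and 8MB with a 64-byte one, i.e. 1024 to 2048 pages that have to come from somewhere before the section is usable.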

> It is especially important for Xen because all of the other
> architectures, when getting a new section, magically have a large new
> contiguous area out of which to get a new mem_map[] or other structures.
> Xen doesn't actually have any contiguous memory at all since it will be
> adding a page at a time.  Could you add some new functionality to the
> Xen driver and make sure to allocate the new section's mem_map[] inside
> the section?  That would get around the problem.
>   

At the time the balloon driver decides it needs some new page 
structures, it has some new pages in hand.  If it knew where to put 
them, it could map those new pages as the seed of a new section.   The 
problem is guaranteeing enough pages to be able to populate the 
section's memmap - it's a bit of a catch-22 if you can't hotplug new 
memory unless you can allocate at least 4-8MB.

>>> That might also be generalizable to anyone else that wants the "fresh
>>> meat" newly-hotplugged memory.  Large page users might be other
>>> consumers here.
>>>       
>> Sure.  The main problem is that 1-3 also ends up implicitly registering 
>> the new section with sysfs, so the bulk online interface becomes visible to 
>> usermode.  If we make that optional (i.e., a separate explicit call the 
>> memory-adder can choose to take) then everything is rosy.
>>     
>
> OK, but the registering isn't a bad thing, either.  You just don't want
> all of the pages to be stuck back into the allocator.  You want to suck
> them into the balloon driver.  Right?
>   

Yeah.  And I have no objection to things appearing in /sys, or even 
making the "state" files do something sensible if you write to them.  I 
just don't want to make it too easy to shoot yourself in the foot.

> Let's put a hook in online_page() to divert the new pages into your
> driver instead of into the allocator.  Problem solved, right?
>
> We might do this with a free_hotplugged_page() function.  It can, in the
> initial introduction patch, just call free_page().  You can modify it
> later to do your bidding.  
>   

Sure.  That's more or less what the ops patch lets me do.  The question 
is whether the right answer is:

   1. add an ops structure to struct mem_section
   2. pass an "online_page" function pointer into online_pages, or
   3. have a special-case Xen hook in online_page

1 I've already implemented, Kame suggested 2, and we also mentioned 3 in 
passing.

I think 2 is looking like a nice tradeoff between complexity and 
flexibility.
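
To make option 2 concrete, here is a rough userspace sketch of the shape it could take. Everything in it is hypothetical: struct page is a stand-in rather than the kernel's definition, and online_page_fn, struct page_pool, pool_take_page() and online_pages_sketch() are names invented for illustration, not the existing online_pages() interface or the posted patches.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only what the sketch needs. */
struct page {
    struct page *next;
};

/* Hypothetical callback type: decides where a newly onlined page goes. */
typedef void (*online_page_fn)(struct page *page, void *ctx);

/* A simple page list, standing in for either the allocator or a balloon. */
struct page_pool {
    struct page *head;
    size_t count;
};

/* Example callback: push the page onto whichever pool the caller passed. */
void pool_take_page(struct page *page, void *ctx)
{
    struct page_pool *pool = ctx;

    page->next = pool->head;
    pool->head = page;
    pool->count++;
}

/*
 * Hypothetical analogue of online_pages(): instead of always handing
 * pages to the allocator, it hands each one to the caller's callback.
 * Returns the number of pages processed.
 */
size_t online_pages_sketch(struct page *pages, size_t nr_pages,
                           online_page_fn online_page, void *ctx)
{
    size_t i;

    assert(online_page != NULL);
    for (i = 0; i < nr_pages; i++)
        online_page(&pages[i], ctx);
    return nr_pages;
}
```

In this shape the generic hotplug path would pass a callback that frees into the allocator, while the balloon driver would pass one that keeps the pages for itself. No per-section ops structure is required, and nothing Xen-specific has to live in online_page().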

> This sounds like a tall order considering Xen's current tight coupling
> between memory mapping and allocation, but I trust you and your
> colleagues to get it done. :)
>   

The combination of needing to map pagetables RO and grant tables for IO 
makes it unlikely that we'll be able to map the kernel space with large 
pages in the short term.  But it does allow usermode to use large pages, 
and maybe it would be worth trying to work out how to reconfigure the 
kernel address space to separate kernel code/data from pagetable and IO 
mappings...

    J