lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20100219134833.GE30258@csn.ul.ie>
Date:	Fri, 19 Feb 2010 13:48:33 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Michael Bohan <mbohan@...eaurora.org>
Cc:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	linux-arm-msm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: Kernel panic due to page migration accessing memory holes

On Thu, Feb 18, 2010 at 05:47:28PM -0800, Michael Bohan wrote:
> On 2/18/2010 2:04 AM, Mel Gorman wrote:
>> On Thu, Feb 18, 2010 at 06:36:04PM +0900, KAMEZAWA Hiroyuki wrote:
>>    
>>>   [Fact]
>>>   - There are 2 banks of memory and a memory hole on your machine.
>>>     As
>>>           0x00200000 - 0x07D00000
>>>           0x40000000 - 0x43000000
>>>
>>>   - Each bancks are in the same zone.
>>>   - You use FLATMEM.
>>>   - You see panic in move_freepages().
>>>   - Your host's MAX_ORDER=11....buddy allocator's alignment is 0x400000
>>>     Then, it seems 1st bank is not algined.
>>>      
>> It's not and assumptions are made about it being aligned.
>>    
>
> Would it be prudent to have the ARM mm init code detect unaligned,  
> discontiguous banks and print a warning message if  
> CONFIG_ARCH_HAS_HOLES_MEMORYMODEL is not configured?  Should we take it  
> a step further and even BUG()?
>

I guess it wouldn't hurt. I wouldn't get too side-tracked though as it's
not the most important issue here.

>> ARM frees unused portions of memmap to save memory. It's why memmap_valid_within()
>> exists when CONFIG_ARCH_HAS_HOLES_MEMORYMODEL although previously only
>> reading /proc/pagetypeinfo cared.
>>
>> In that case, the FLATMEM memory map had unexpected holes which "never"
>> happens and that was the workaround. The problem here is that there are
>> unaligned zones but no pfn_valid() implementation that can identify
>> them as you'd have with SPARSEMEM. My expectation is that you are using
>> the pfn_valid() implementation from asm-generic
>>
>> #define pfn_valid(pfn)          ((pfn)<  max_mapnr)
>>
>> which is insufficient in your case.
>>    
>
> I am actually using the pfn_valid implementation FLATMEM in  
> arch/arm/include/asm/memory.h.  This one is very similar to the  
> asm-generic, and has no knowledge of the holes.
>

Same problem applies so.

>> I think it's more likely the at the memmap he is accessing has been
>> freed and is effectively random data.
>>
>>    
>
> I also think this is the case.
>
>> SPARSEMEM would give you an implementation of pfn_valid() that you could
>> use here. The choices that spring to mind are;
>>
>> 1. reduce MAX_ORDER so they are aligned (easiest)
>>    
>
> Is it safe to assume that reducing MAX_ORDER will hurt performance?
>

No, it does not necessarily reduce performance. In some circumstances it
might even help although I wouldn't chase after it.

Downside one is that some hash tables might be getting hurt if you have a
very large amount of memory (look for "hash table entries:" in dmesg after
booting to see what order is being used).

Downside two is that if some drivers require large contiguous memory
early in boot, they might be hurt by MAX_ORDER being lower. If you
require CONFIG_HUGETLB_PAGE, it might not be possible to reduce
MAX_ORDER depending on the size of the huge page.

>> 2. use SPARSEMEM (easy, but not necessary what you want to do, might
>> 	waste memory unless you drop MAX_ORDER as well)
>>    
>
> We intend to use SPARSEMEM, but we'd also like to maintain FLATMEM  
> compatibility for some configurations.  My guess is that there are other  
> ARM users that may want this support as well.
>
>> 3. implement a pfn_valid() that can handle the holes and set
>> 	CONFIG_HOLES_IN_ZONE so it's called in move_freepages() to
>> 	deal with the holes (should pass this by someone more familiar
>> 	with ARM than I)
>>    
>
> This option seems the best to me.  We should be able to implement an ARM  
> specific pfn_valid() that walks the ARM meminfo struct to ensure the pfn  
> is not within a hole.
>

Be sure to check your performance before and after. pfn_valid_within()
is used in a fair few places and you are likely enabling it.

> My only concern with this is a comment in __rmqueue_fallback() after  
> calling move_freepages_block()  that states "Claim the whole block if  
> over half of it is free".  Suppose only 1 MB is beyond the bank limit.   
> That means that over half of the pages of the 4 MB block will be  
> reported by move_freepages() as free -- but 1 MB of those pages are  
> invalid.  Won't this cause problems if these pages are assumed to be  
> part of an active block?
>

The only operation taking place there is updating a bitmap so I doubt
you'll hit snags there.

> It seems like we should have an additional check in  
> move_freepages_block() with pfn_valid_within() to check the last page in  
> the block (eg. end_pfn) before calling move_freepages_block().  If the  
> last page is not valid, then we shouldn't we return 0 as in the zone  
> span check? This will also skip the extra burden of checking each  
> individual page, when we already know the proposed range is invalid.
>

You don't know where the holes are going to be so it is paranod rather
than making assumptions about where architectures put holes.

> Assuming we did return 0 in this case, would that sub-block of pages  
> ever be usable for anything else, or would it be effectively wasted? 

They're still usable.

> If  
> this memory were wasted, then adjusting MAX_ORDER would have an  
> advantage in this sense -- ignoring any performance implications.
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ