Date:	Wed, 17 Feb 2010 16:45:54 -0800
From:	Michael Bohan <mbohan@...eaurora.org>
To:	linux-mm@...ck.org
CC:	linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Kernel panic due to page migration accessing memory holes

Hi,

I have encountered a kernel panic on the ARM/msm platform in the mm 
migration code on 2.6.29.  My memory configuration has two discontiguous 
banks per our ATAG definition.   These banks end up on addresses that 
are 1 MB aligned.  I am using FLATMEM (not SPARSEMEM), but my 
understanding is that SPARSEMEM should not be necessary to support this 
configuration.  Please correct me if I'm wrong.

The crash occurs in mm/page_alloc.c:move_freepages() when being passed a 
start_page that corresponds to the last several megabytes of our first 
memory bank.  The code in move_freepages_block() rounds the passed-in
page frame number down to a multiple of pageblock_nr_pages, which
corresponds to 4 MB.  It then passes that aligned pfn as the beginning
of a 4 MB range to move_freepages().  The problem is that since our
bank's end address is
not 4 MB aligned, the range passed to move_freepages() exceeds the end 
of our memory bank.  The code later blows up when trying to access 
uninitialized page structures.
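
The overrun described above can be reproduced with plain arithmetic.
What follows is a minimal user-space sketch, not kernel code.  It
assumes 4 KB pages (so a 4 MB pageblock is 1024 pages); the names
pageblock_range() and range_overruns_bank() are hypothetical helpers
introduced only for illustration.

```c
#include <stdbool.h>

/* Assumption for illustration: 4 KB pages, so 4 MB == 1024 pages. */
#define PAGEBLOCK_NR_PAGES 1024UL

struct pfn_range {
    unsigned long start_pfn;  /* first pfn of the pageblock */
    unsigned long end_pfn;    /* last pfn of the pageblock (inclusive) */
};

/*
 * Round pfn down to a pageblock boundary and return the full block,
 * mirroring the alignment step described for move_freepages_block().
 */
static struct pfn_range pageblock_range(unsigned long pfn)
{
    struct pfn_range r;

    r.start_pfn = pfn & ~(PAGEBLOCK_NR_PAGES - 1);
    r.end_pfn = r.start_pfn + PAGEBLOCK_NR_PAGES - 1;
    return r;
}

/* True if the block extends past the last valid pfn of the bank. */
static bool range_overruns_bank(struct pfn_range r,
                                unsigned long bank_last_pfn)
{
    return r.end_pfn > bank_last_pfn;
}
```

As a made-up example, take a bank covering physical addresses
0x00000000 to 0x036fffff (55 MB, so its end is 1 MB aligned but not
4 MB aligned).  Its last pfn is 0x36ff.  A page at pfn 0x3500 is
rounded down to 0x3400, and the resulting block ends at pfn 0x37ff,
which lies 1 MB beyond the end of the bank.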

As a temporary fix, I added some code to move_freepages_block() that 
inspects whether the range exceeds our first memory bank -- returning 0 
if it does.  This is not a clean solution, since it requires exporting
the ARM-specific meminfo structure to extract the bank information.
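
A check of this kind might look roughly like the sketch below.  It is a
user-space model under stated assumptions, not the actual patch: struct
mem_bank is a hypothetical stand-in for the bank information (start
address and size in bytes), 4 KB pages are assumed, and
pageblock_within_bank() is an invented name.

```c
#include <stdbool.h>

/* Assumptions for illustration: 4 KB pages, 4 MB pageblocks. */
#define PAGE_SHIFT         12
#define PAGEBLOCK_NR_PAGES 1024UL

/* Hypothetical description of one memory bank. */
struct mem_bank {
    unsigned long start;  /* physical start address */
    unsigned long size;   /* size in bytes */
};

/*
 * Return true if the pageblock containing pfn ends inside the bank.
 * A caller modelled on the workaround would return 0 (no pages moved)
 * when this is false.
 */
static bool pageblock_within_bank(unsigned long pfn,
                                  const struct mem_bank *bank)
{
    unsigned long start_pfn = pfn & ~(PAGEBLOCK_NR_PAGES - 1);
    unsigned long end_pfn = start_pfn + PAGEBLOCK_NR_PAGES - 1;
    unsigned long bank_last_pfn =
        ((bank->start + bank->size) >> PAGE_SHIFT) - 1;

    return end_pfn <= bank_last_pfn;
}
```

The sketch takes the bank as a parameter, which makes the drawback
visible: the generic mm code would need access to architecture-specific
bank data.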

I see an option exists called CONFIG_HOLES_IN_ZONE, which has control 
over the definition of pfn_valid_within() used in move_freepages().  
This option seems relevant to the problem.  The ia64 architecture has a 
special version of pfn_valid() called ia64_pfn_valid() that is used in 
conjunction with this option.  It appears to inspect the page 
structure's state in a safe way that does not cause a crash, and can 
presumably be used to determine whether the page structure is 
initialized properly.  The ARM version of pfn_valid() used in the
FLATMEM scenario does not appear to be aware of memory holes, and will
blindly return true for pfns that fall inside a hole.
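
The difference between a flat range check and a hole-aware check can be
modelled in user space as follows.  This is only a sketch: it uses a
lookup over a hypothetical bank table, which is a different technique
from the page-structure probing that ia64_pfn_valid() appears to do, and
none of the names below are kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bank description in units of page frames. */
struct pfn_bank {
    unsigned long first_pfn;
    unsigned long nr_pages;
};

/* Flat check: only tests the overall span, so holes look valid. */
static bool pfn_valid_flat(unsigned long pfn,
                           unsigned long min_pfn, unsigned long max_pfn)
{
    return pfn >= min_pfn && pfn < max_pfn;
}

/* Hole-aware check: the pfn must fall inside one of the banks. */
static bool pfn_valid_banks(unsigned long pfn,
                            const struct pfn_bank *banks, size_t nr_banks)
{
    size_t i;

    for (i = 0; i < nr_banks; i++) {
        if (pfn >= banks[i].first_pfn &&
            pfn - banks[i].first_pfn < banks[i].nr_pages)
            return true;
    }
    return false;
}
```

With two made-up banks covering pfns 0x0000-0x36ff and 0x4000-0x5fff,
pfn 0x3780 lies in the hole: the flat check over 0x0000-0x5fff accepts
it, while the bank lookup rejects it.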

I have looked on linux-next, and at least the functions mentioned above 
have not changed.

Is there a stated requirement that memory banks must end on 4 MB
aligned addresses?  Although I found this problem on ARM, it
appears upon inspection that the problem could occur on other 
architectures as well, given the memory map assumptions stated above.  
I'm hoping that some mm experts might understand the problem in greater 
detail.

Thanks,
Michael