Message-Id: <1351245581-16652-1-git-send-email-laijs@cn.fujitsu.com>
Date:	Fri, 26 Oct 2012 17:59:31 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	linux-kernel@...r.kernel.org, Mel Gorman <mgorman@...e.de>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Minchan Kim <minchan@...nel.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Michal Hocko <mhocko@...e.cz>, linux-mm@...ck.org
Subject: [PATCH] page_alloc: fix the incorrect adjustment to zone->present_pages

The current free_area_init_core() contains incorrect code for adjusting
->present_pages. It can make ->present_pages wrap around (underflow), which
leaves the system unusable (we could not create any process/thread in our
test) and causes further problems.

Details:
1) Some/many zones do not contain the memory that is used for memmap,
   or the memory actually used for memmap is much less than "memmap_pages"
   (memmap_pages = PAGE_ALIGN(span_size * sizeof(struct page)) >> PAGE_SHIFT).
   CONFIG_SPARSEMEM is an example.

2) free_area_init_core() still applies the adjustment: zone->present_pages -= memmap_pages
3) If the zone has a big hole, memmap_pages (computed from the spanned size, holes
   included) is large, so zone->present_pages becomes much smaller than the memory
   that is really present.
4) When we offline one or several memory sections of the zone: zone->present_pages -= offline_size
5) Now zone->present_pages may *OVERFLOW* (wrap around below zero).

So the adjustment is dangerous and incorrect.

Addition 1:
In the current kernel, the memmaps are not related or bound to any zone:
	CONFIG_FLATMEM: global memmap
	CONFIG_DISCONTIGMEM: node-specific memmap
	CONFIG_SPARSEMEM: memory-section-specific memmap
None of them is a zone-specific memmap, and the memory used for memmap is not bound to any zone.
So the adjustment "zone->present_pages -= memmap_pages" subtracts an unrelated value
and makes no sense.

Addition 2:
This adjustment was introduced to make page reclaim and the watermarks behave better,
but it is wrong in the current kernel and even makes page reclaim and the watermarks
worse, which defeats its original purpose.

The adjustment is buggy, subtracts an unrelated value, and defeats its original
purpose, so we simply remove it.

CC: Mel Gorman <mgorman@...e.de>
Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
 mm/page_alloc.c |   20 +-------------------
 1 files changed, 1 insertions(+), 19 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bb90971..6bf72e3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4455,30 +4455,12 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
-		unsigned long size, realsize, memmap_pages;
+		unsigned long size, realsize;
 
 		size = zone_spanned_pages_in_node(nid, j, zones_size);
 		realsize = size - zone_absent_pages_in_node(nid, j,
 								zholes_size);
 
-		/*
-		 * Adjust realsize so that it accounts for how much memory
-		 * is used by this zone for memmap. This affects the watermark
-		 * and per-cpu initialisations
-		 */
-		memmap_pages =
-			PAGE_ALIGN(size * sizeof(struct page)) >> PAGE_SHIFT;
-		if (realsize >= memmap_pages) {
-			realsize -= memmap_pages;
-			if (memmap_pages)
-				printk(KERN_DEBUG
-				       "  %s zone: %lu pages used for memmap\n",
-				       zone_names[j], memmap_pages);
-		} else
-			printk(KERN_WARNING
-				"  %s zone: %lu pages exceeds realsize %lu\n",
-				zone_names[j], memmap_pages, realsize);
-
 		/* Account for reserved pages */
 		if (j == 0 && realsize > dma_reserve) {
 			realsize -= dma_reserve;
-- 
1.7.4.4
