lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1435193121-25880-11-git-send-email-iamjoonsoo.kim@lge.com>
Date:	Thu, 25 Jun 2015 09:45:21 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Minchan Kim <minchan@...nel.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: [RFC PATCH 10/10] mm/compaction: new threshold for compaction depleted zone

Now, compaction algorithm become powerful. Migration scanner traverses
whole zone range. So, old threshold for depleted zone which is designed
to imitate compaction deferring approach isn't appropriate for current
compaction algorithm. If we adhere to current threshold, 1, we can't
avoid excessive overhead caused by compaction, because one compaction
for low order allocation would be easily successful in any situation.

This patch re-implements threshold calculation based on zone size and
allocation requested order. We judge whther compaction possibility is
depleted or not by number of successful compaction. Roughly, 1/100
of future scanned area should be allocated for high order page during
one comaction iteration in order to determine whether zone's compaction
possiblity is depleted or not.

Below is test result with following setup.

Memory is artificially fragmented to make order 3 allocation hard. And,
most of pageblocks are changed to unmovable migratetype.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and 200 MB is
  	occupied by memory hogger. Most pageblocks are unmovable
  	migratetype.
  Fragmentation: Successful order 3 allocation candidates may be around
  	1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
  	with GFP_NORETRY. This value is determined to saturate allocation
  	success.

Test: hogger-frag-unmovable
                                  redesign  threshold
compact_free_scanned               6441095    2235764
compact_isolated                   2711081     647701
compact_migrate_scanned            4175464    1697292
compact_stall                         2059       2092
compact_success                        207        210
pgmigrate_success                  1348113     318395
Success:                                44         40
Success(N):                             90         83

This change results in greatly decreasing compaction overhead when
zone's compaction possibility is nearly depleted. But, I should admit
that it's not perfect because compaction success rate is decreased.
More precise tuning threshold would restore this regression, but,
it highly depends on workload so I'm not doing it here.

Other test doesn't show any regression.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and kernel
  	build is running on background. Most pageblocks are movable
  	migratetype.
  Fragmentation: Successful order 3 allocation candidates may be around
  	1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
  	with GFP_NORETRY. This value is determined to saturate allocation
  	success.

Test: build-frag-movable
                                  redesign  threshold
compact_free_scanned               2359553    1461131
compact_isolated                    907515     387373
compact_migrate_scanned            3785605    2177090
compact_stall                         2195       2157
compact_success                        247        225
pgmigrate_success                   439739     182366
Success:                                43         43
Success(N):                             89         90

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@....com>
---
 mm/compaction.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 99f533f..63702b3 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -129,19 +129,24 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 
 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_FAILED 4
-#define COMPACT_MIN_DEPLETE_THRESHOLD 1UL
+#define COMPACT_MIN_DEPLETE_THRESHOLD 4UL
 #define COMPACT_MIN_SCAN_LIMIT (pageblock_nr_pages)
 
 static bool compaction_depleted(struct zone *zone)
 {
-	unsigned long threshold;
+	unsigned long nr_possible;
 	unsigned long success = zone->compact_success;
+	unsigned long threshold;
 
-	/*
-	 * Now, to imitate current compaction deferring approach,
-	 * choose threshold to 1. It will be changed in the future.
-	 */
-	threshold = COMPACT_MIN_DEPLETE_THRESHOLD;
+	nr_possible = zone->managed_pages >> zone->compact_order_failed;
+
+	/* Migration scanner can scans more than 1/4 range of zone */
+	nr_possible >>= 2;
+
+	/* We hope to succeed more than 1/100 roughly */
+	threshold = nr_possible >> 7;
+
+	threshold = max(threshold, COMPACT_MIN_DEPLETE_THRESHOLD);
 	if (success >= threshold)
 		return false;
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ