Message-ID: <alpine.DEB.2.00.1111151554190.3781@chino.kir.corp.google.com>
Date: Tue, 15 Nov 2011 16:07:08 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Mel Gorman <mgorman@...e.de>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Minchan Kim <minchan.kim@...il.com>, Jan Kara <jack@...e.cz>,
Andy Isaacson <adi@...apodia.org>,
Johannes Weiner <jweiner@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm: Do not stall in synchronous compaction for THP allocations
On Tue, 15 Nov 2011, Mel Gorman wrote:
> Adding sync here could obviously be implemented although it may
> require both always-sync and madvise-sync. Alternatively, something
> like an options file could be created to create a bitmap similar to
> what ftrace does. Whatever the mechanism, it exposes the fact that
> "sync compaction" is used. If that turns out to be not enough, then
> you may want to add other steps like aggressively reclaiming memory
> which also potentially may need to be controlled via the sysfs file
> and this is the slippery slope.
>
So this patch would be the fifth time this line has been changed, and it
has always been switched between true and !(gfp_mask & __GFP_NO_KSWAPD).
Instead of changing it every few months, I'd suggest that we tie the
semantics of the defrag tunable directly to sync_compaction, since we're
primarily targeting transparent hugepages with this change anyway for the
"always" case. Comments?
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -116,6 +116,13 @@ echo always >/sys/kernel/mm/transparent_hugepage/defrag
echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo never >/sys/kernel/mm/transparent_hugepage/defrag
+If defrag is set to "always", then all hugepage allocations also attempt
+synchronous memory compaction which makes the allocation as aggressive
+as possible. The overhead of attempting to allocate the hugepage is
+considered acceptable because of the long-term benefits of the hugepage
+itself at runtime. If the VM should fall back to using regular pages
+instead, then you should use "madvise" or "never".
+
khugepaged will be automatically started when
transparent_hugepage/enabled is set to "always" or "madvise, and it'll
be automatically shutdown if it's set to "never".
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2168,7 +2168,17 @@ rebalance:
sync_migration);
if (page)
goto got_pg;
- sync_migration = true;
+
+ /*
+ * Do not use synchronous migration for transparent hugepages unless
+ * defragmentation is always attempted for such allocations since it
+ * can stall in writeback, which is far worse than simply failing to
+ * promote a page. Otherwise, we really do want a hugepage and are as
+ * aggressive as possible to allocate it.
+ */
+ sync_migration = !(gfp_mask & __GFP_NO_KSWAPD) ||
+ (transparent_hugepage_flags &
+ (1 << TRANSPARENT_HUGEPAGE_DEFRAG_FLAG));
/* Try direct reclaim and then allocating */
page = __alloc_pages_direct_reclaim(gfp_mask, order,
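For anyone who wants to check the truth table of the new sync_migration expression without building a kernel, here is a minimal userspace sketch. It is not kernel code: MODEL_GFP_NO_KSWAPD and MODEL_THP_DEFRAG_FLAG_BIT are hypothetical stand-ins with made-up values, and use_sync_migration() is a name invented for illustration. Only the shape of the boolean expression is taken from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for __GFP_NO_KSWAPD and
 * TRANSPARENT_HUGEPAGE_DEFRAG_FLAG. The values are arbitrary and do
 * not match the real kernel definitions.
 */
#define MODEL_GFP_NO_KSWAPD		(1u << 0)
#define MODEL_THP_DEFRAG_FLAG_BIT	3

/*
 * Same shape as the expression in the patch: use synchronous migration
 * unless the allocation carries the no-kswapd bit (as THP allocations
 * do) and the defrag "always" bit is clear.
 */
static bool use_sync_migration(unsigned int gfp_mask, unsigned long thp_flags)
{
	return !(gfp_mask & MODEL_GFP_NO_KSWAPD) ||
	       (thp_flags & (1UL << MODEL_THP_DEFRAG_FLAG_BIT));
}

/* Walk the four input combinations of the model. */
static void model_self_check(void)
{
	unsigned long defrag_always = 1UL << MODEL_THP_DEFRAG_FLAG_BIT;

	/* Allocation without the no-kswapd bit: sync either way. */
	assert(use_sync_migration(0, 0));
	assert(use_sync_migration(0, defrag_always));

	/* THP-style allocation: sync only when the defrag bit is set. */
	assert(!use_sync_migration(MODEL_GFP_NO_KSWAPD, 0));
	assert(use_sync_migration(MODEL_GFP_NO_KSWAPD, defrag_always));
}
```

Calling model_self_check() from a small main() exercises all four cases. The only combination that skips synchronous migration is a no-kswapd allocation with the defrag bit clear, which is the case the comment in the patch describes as simply failing to promote a page.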