Message-ID: <20150330102802.GQ4701@suse.de>
Date: Mon, 30 Mar 2015 11:28:02 +0100
From: Mel Gorman <mgorman@...e.de>
To: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
David Rientjes <rientjes@...gle.com>,
Rik van Riel <riel@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] mm: hugetlb: add stub-like do_hugetlb_numa()
On Mon, Mar 30, 2015 at 09:40:54AM +0000, Naoya Horiguchi wrote:
> hugetlb doesn't support NUMA balancing now, but that doesn't mean that we
> don't have to prepare the hugetlb code to handle PROTNONE entries properly.
> In the current kernel, when a process accesses a hugetlb range protected
> with PROTNONE, it causes unexpected COWs, which finally put the hugetlb
> subsystem into a broken/uncontrollable state where, for example,
> h->resv_huge_pages is decremented too far and wraps around to a very large
> number, and the free hugepage pool is no longer maintainable.
>
Ouch!
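
To make the wraparound concrete, here is a minimal userspace sketch. It assumes only that the reservation counter is an unsigned long; struct toy_hstate and toy_consume_reserve() are hypothetical names for illustration, not kernel code.

```c
#include <limits.h>

/* Hypothetical stand-in for the reservation counter kept in struct hstate. */
struct toy_hstate {
	unsigned long resv_huge_pages;
};

/*
 * Unchecked decrement, modelling a reservation being consumed by a fault
 * that should never have taken the COW path.
 */
static void toy_consume_reserve(struct toy_hstate *h)
{
	/* Unsigned arithmetic wraps: 0 - 1 becomes ULONG_MAX. */
	h->resv_huge_pages--;
}
```

One decrement too many leaves the counter at ULONG_MAX, which is the "very large number" described in the report.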
> This patch simply clears PROTNONE when it's caught out. Real NUMA balancing
> code for hugetlb is not implemented yet (not sure how much it's worth doing.)
>
It's not worth doing at all. Furthermore, an application that took the
effort to allocate and use hugetlb pages is not going to appreciate the
minor faults incurred by automatic balancing for no gain. Why not something
like the following untested patch? It simply avoids doing protection updates
on hugetlb VMAs. If it works for you, feel free to take it and reuse most
of the same changelog for it. I'll only be intermittently online for the
next few days and would rather not unnecessarily delay a fix.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ce18f3c097a..74bfde50fd4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2161,8 +2161,10 @@ void task_numa_work(struct callback_head *work)
 		vma = mm->mmap;
 	}
 	for (; vma; vma = vma->vm_next) {
-		if (!vma_migratable(vma) || !vma_policy_mof(vma))
+		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
+		    is_vm_hugetlb_page(vma)) {
 			continue;
+		}
 
 		/*
 		 * Shared library pages mapped by multiple processes are not
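
The effect of the extra condition can be modelled outside the kernel. The sketch below is an assumption-laden toy, not the scheduler code: struct toy_vma and its boolean fields are hypothetical stand-ins for the results of vma_migratable(), vma_policy_mof() and is_vm_hugetlb_page(), and toy_count_scanned() only counts the VMAs that would still receive protection updates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct vm_area_struct. */
struct toy_vma {
	bool migratable;      /* models vma_migratable(vma) */
	bool policy_mof;      /* models vma_policy_mof(vma) */
	bool hugetlb;         /* models is_vm_hugetlb_page(vma) */
	struct toy_vma *next; /* models vma->vm_next */
};

/* Mirrors the patched condition: hugetlb VMAs are skipped as well. */
static bool toy_skip_vma(const struct toy_vma *vma)
{
	return !vma->migratable || !vma->policy_mof || vma->hugetlb;
}

/* Walk the list and count the VMAs that would still be scanned. */
static size_t toy_count_scanned(const struct toy_vma *vma)
{
	size_t scanned = 0;

	for (; vma; vma = vma->next) {
		if (toy_skip_vma(vma))
			continue;
		scanned++;
	}
	return scanned;
}
```

With a list holding one non-migratable VMA, one ordinary VMA and one hugetlb VMA, only the ordinary one is counted, so the hugetlb range never has its protections changed in the first place.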