Date: Wed, 25 Feb 2015 11:52:28 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: David Rientjes <rientjes@...gle.com>,
    Andrew Morton <akpm@...ux-foundation.org>
CC: Greg Thelen <gthelen@...gle.com>,
    "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
    Linus Torvalds <torvalds@...ux-foundation.org>,
    linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [patch v2 for-4.0] mm, thp: really limit transparent hugepage allocation to local node

On 02/25/2015 12:24 AM, David Rientjes wrote:
> From: Greg Thelen <gthelen@...gle.com>
>
> Commit 077fcf116c8c ("mm/thp: allocate transparent hugepages on local
> node") restructured alloc_hugepage_vma() with the intent of only
> allocating transparent hugepages locally when there was not an effective
> interleave mempolicy.
>
> alloc_pages_exact_node() does not limit the allocation to the single
> node, however, but rather prefers it.  This is because __GFP_THISNODE is
> not set which would cause the node-local nodemask to be passed.  Without
> it, only a nodemask that prefers the local node is passed.

Oops, good catch.
But I believe we have the same problem with khugepaged_alloc_page(),
rendering the recent node determination and zone_reclaim strictness
patches partially useless.

Then I start to wonder about other alloc_pages_exact_node() users. Some
do pass __GFP_THISNODE, others not - are they also mistaken? I guess the
function is a misnomer - when I see "exact_node", I expect the
__GFP_THISNODE behavior.

I think to avoid such hidden catches, we should create
alloc_pages_preferred_node() variant, change the exact_node() variant to
pass __GFP_THISNODE, and audit and adjust all callers accordingly.

Also, you pass __GFP_NOWARN but that should be covered by GFP_TRANSHUGE
already. Of course, nothing guarantees that hugepage == true implies
that gfp == GFP_TRANSHUGE... but current in-tree callers conform to that.

> Fix this by passing __GFP_THISNODE and falling back to small pages when
> the allocation fails.
>
> Fixes: 077fcf116c8c ("mm/thp: allocate transparent hugepages on local node")
> Signed-off-by: Greg Thelen <gthelen@...gle.com>
> Signed-off-by: David Rientjes <rientjes@...gle.com>
> ---
>  v2: GFP_THISNODE actually defers compaction and reclaim entirely based on
>      the combination of gfp flags.  We want to try compaction and reclaim,
>      so only set __GFP_THISNODE.  We still set __GFP_NOWARN to suppress
>      oom warnings in the kernel log when we can simply fallback to small
>      pages.
>
>  mm/mempolicy.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1985,7 +1985,10 @@ retry_cpuset:
>  		nmask = policy_nodemask(gfp, pol);
>  		if (!nmask || node_isset(node, *nmask)) {
>  			mpol_cond_put(pol);
> -			page = alloc_pages_exact_node(node, gfp, order);
> +			page = alloc_pages_exact_node(node, gfp |
> +							    __GFP_THISNODE |
> +							    __GFP_NOWARN,
> +						      order);
>  			goto out;
>  		}
>  	}
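
The difference discussed in this thread, preferring a node versus restricting the allocation to it, can be illustrated with a small user-space model. This is only a sketch and not kernel code: TOY_THISNODE, toy_alloc_on_node() and the two wrappers are hypothetical names standing in for __GFP_THISNODE and the suggested preferred_node/exact_node split, and the fallback order (lowest-numbered other node with free pages) is an assumption made for illustration, not the real zonelist order.

```c
#include <stddef.h>

/* Hypothetical stand-in for __GFP_THISNODE in this toy model. */
#define TOY_THISNODE 0x1u

/*
 * Toy model of node selection. free_pages[i] is the number of free pages
 * on node i. Returns the node that would satisfy the request, or -1 if
 * the request fails.
 *
 * Without TOY_THISNODE the requested node is only preferred: if it has no
 * free pages, another node is used. With TOY_THISNODE there is no
 * fallback to other nodes.
 */
static int toy_alloc_on_node(const int *free_pages, size_t nr_nodes,
			     size_t node, unsigned int flags)
{
	size_t i;

	if (node >= nr_nodes)
		return -1;

	/* The requested node is always tried first. */
	if (free_pages[node] > 0)
		return (int)node;

	/* Strict mode: fail rather than use a remote node. */
	if (flags & TOY_THISNODE)
		return -1;

	/* Assumed fallback order: lowest-numbered other node with memory. */
	for (i = 0; i < nr_nodes; i++) {
		if (i != node && free_pages[i] > 0)
			return (int)i;
	}
	return -1;
}

/* "preferred" semantics: the node is a hint, fallback is allowed. */
static int toy_alloc_preferred_node(const int *free_pages, size_t nr_nodes,
				    size_t node)
{
	return toy_alloc_on_node(free_pages, nr_nodes, node, 0);
}

/* "exact" semantics: the node is a hard restriction. */
static int toy_alloc_exact_node(const int *free_pages, size_t nr_nodes,
				size_t node)
{
	return toy_alloc_on_node(free_pages, nr_nodes, node, TOY_THISNODE);
}
```

In this model, with node 0 exhausted and the other nodes holding free pages, toy_alloc_preferred_node() for node 0 returns a remote node, while toy_alloc_exact_node() returns -1. The -1 case corresponds to the point where the patch expects the THP caller to fall back to small pages.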