Message-ID: <20240725035318.471-1-hailong.liu@oppo.com>
Date: Thu, 25 Jul 2024 11:53:15 +0800
From: <hailong.liu@...o.com>
To: Andrew Morton <akpm@...ux-foundation.org>, Uladzislau Rezki
<urezki@...il.com>, Christoph Hellwig <hch@...radead.org>, Lorenzo Stoakes
<lstoakes@...il.com>, Vlastimil Babka <vbabka@...e.cz>, Michal Hocko
<mhocko@...e.com>
CC: Hailong.Liu <hailong.liu@...o.com>, Barry Song <21cnbao@...il.com>,
Baoquan He <bhe@...hat.com>, Matthew Wilcox <willy@...radead.org>, "Tangquan
. Zheng" <zhengtangquan@...o.com>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>
Subject: [RFC PATCH v2] mm/vmalloc: fix incorrect __vmap_pages_range_noflush() when vm_area_alloc_pages() falls back from high order to order-0
From: "Hailong.Liu" <hailong.liu@...o.com>
The scenario where the issue occurs is as follows:

CONFIG: vmap_allow_huge = true && 2M is PMD_SIZE
kvmalloc(2M, __GFP_NOFAIL|GFP_XXX)
    __vmalloc_node_range(vm_flags=VM_ALLOW_HUGE_VMAP)
        vm_area_alloc_pages(order=9) ---> order-9 allocation fails and falls
                                          back to order-0, and phys_addr is
                                          aligned with PMD_SIZE
        vmap_pages_range
            vmap_pages_range_noflush
                __vmap_pages_range_noflush(page_shift = 21) ----> incorrect *huge* vmap here
In fact, as long as page_shift is not equal to PAGE_SHIFT, there
might be issues with __vmap_pages_range_noflush().
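Roughly, the huge path does something like the following (a simplified
sketch of the mm/vmalloc.c internals, not the exact upstream code;
huge_map_sketch() is just a name for illustration):

static int huge_map_sketch(unsigned long addr, unsigned long end,
                           pgprot_t prot, struct page **pages,
                           unsigned int page_shift)
{
        unsigned int i, nr = (end - addr) >> PAGE_SHIFT;

        for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
                /*
                 * Only pages[i] is consulted for the whole
                 * 1UL << page_shift range, i.e. pages[i]..pages[i + 511]
                 * are assumed to be physically contiguous for a 2M PMD
                 * mapping.  Order-0 fallback pages break that assumption.
                 */
                int err = vmap_range_noflush(addr, addr + (1UL << page_shift),
                                             page_to_phys(pages[i]), prot,
                                             page_shift);
                if (err)
                        return err;
                addr += 1UL << page_shift;
        }
        return 0;
}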
The patch also removes VM_ALLOW_HUGE_VMAP in kvmalloc_node(). There
are several reasons for this:
- It increases the memory footprint because of the alignment.
- It increases the likelihood of kvmalloc() allocation failures.
- Removing it fixes the original issue where kvmalloc() with
  __GFP_NOFAIL may return NULL.
Besides, if drivers want huge vmap mappings, they should use
vmalloc_huge() instead.
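For a driver that really benefits from huge mappings, opting in
explicitly might look like this (hypothetical example; alloc_big_table()
is made up for illustration):

#include <linux/gfp.h>
#include <linux/vmalloc.h>

/* Hypothetical helper: explicitly opt in to huge vmap mappings */
static void *alloc_big_table(unsigned long size)
{
        return vmalloc_huge(size, GFP_KERNEL);
}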
Fix it by disabling the fallback and removing VM_ALLOW_HUGE_VMAP in
kvmalloc_node().
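For reference, a trigger looks roughly like this (hypothetical test
module; it only hits the bug when the kmalloc() attempt and the order-9
page allocation both fail under fragmentation, so vm_area_alloc_pages()
falls back to order-0):

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/sizes.h>

static void *buf;

static int __init huge_vmap_trigger_init(void)
{
        /* 2M __GFP_NOFAIL request that may take the VM_ALLOW_HUGE_VMAP path */
        buf = kvmalloc(SZ_2M, GFP_KERNEL | __GFP_NOFAIL);
        return 0;
}

static void __exit huge_vmap_trigger_exit(void)
{
        kvfree(buf);
}

module_init(huge_vmap_trigger_init);
module_exit(huge_vmap_trigger_exit);
MODULE_LICENSE("GPL");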
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
CC: Barry Song <21cnbao@...il.com>
CC: Baoquan He <bhe@...hat.com>
CC: Matthew Wilcox <willy@...radead.org>
Reported-by: Tangquan.Zheng <zhengtangquan@...o.com>
Signed-off-by: Hailong.Liu <hailong.liu@...o.com>
---
mm/util.c | 2 +-
mm/vmalloc.c | 9 ---------
2 files changed, 1 insertion(+), 10 deletions(-)
diff --git a/mm/util.c b/mm/util.c
index 669397235787..b23133b738cf 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -657,7 +657,7 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
* protection games.
*/
return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
- flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
+ flags, PAGE_KERNEL, 0,
node, __builtin_return_address(0));
}
EXPORT_SYMBOL(kvmalloc_node);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 03c78fae06f3..1914768f473e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3577,15 +3577,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
page = alloc_pages(alloc_gfp, order);
else
page = alloc_pages_node(nid, alloc_gfp, order);
- if (unlikely(!page)) {
- if (!nofail)
- break;
-
- /* fall back to the zero order allocations */
- alloc_gfp |= __GFP_NOFAIL;
- order = 0;
- continue;
- }
/*
* Higher order allocations must be able to be treated as
--
After [1] I checked the code and I can't find a reasonable band-aid to fix
this, so the v2 patch works but is ugly. Glad to hear a better solution :)
[1] https://lore.kernel.org/lkml/20240724182827.nlgdckimtg2gwns5@oppo.com/
2.34.1