[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1650584815.0dtcbd4qky.astroid@bobo.none>
Date: Fri, 22 Apr 2022 10:12:16 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"ast@...nel.org" <ast@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"dborkman@...hat.com" <dborkman@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"hch@...radead.org" <hch@...radead.org>,
"hpa@...or.com" <hpa@...or.com>,
"imbrenda@...ux.ibm.com" <imbrenda@...ux.ibm.com>,
"Kernel-team@...com" <Kernel-team@...com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"mbenes@...e.cz" <mbenes@...e.cz>,
"mcgrof@...nel.org" <mcgrof@...nel.org>,
"pmladek@...e.com" <pmladek@...e.com>,
"rppt@...nel.org" <rppt@...nel.org>,
"song@...nel.org" <song@...nel.org>,
"songliubraving@...com" <songliubraving@...com>
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
Excerpts from Linus Torvalds's message of April 22, 2022 2:15 am:
> On Thu, Apr 21, 2022 at 8:47 AM Edgecombe, Rick P
> <rick.p.edgecombe@...el.com> wrote:
>>
>> I wonder if it
>> might have to do with the vmalloc huge pages using compound pages, then
>> some caller doing vmalloc_to_page() and getting surprised with what
>> they could get away with in the struct page.
>
> Very likely. We have 100+ users of vmalloc_to_page() in random
> drivers, and the gpu code does show up on that list.
>
> And is very much another case of "it's always been broken, but
> enabling it on x86 made the breakage actually show up in real life".
Okay that looks like a valid breakage. *Possibly* fb_deferred_io_fault()
using pages vmalloced to screen_buffer? Or a couple of the gpu drivers
are playing with page->mapping as well, not sure if they're vmalloced.
But the fix is this (untested at the moment). It's not some fundamental
reason why any driver should care about allocation size, it's a simple
bug in my code that missed that case. The whole point of the design is
that it's transparent to callers!
Thanks,
Nick
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e163372d3967..70933f4ed069 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2925,12 +2925,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
if (nr != nr_pages_request)
break;
}
- } else
- /*
- * Compound pages required for remap_vmalloc_page if
- * high-order pages.
- */
- gfp |= __GFP_COMP;
+ }
/* High-order pages or fallback path if "bulk" fails. */
@@ -2944,6 +2939,13 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
page = alloc_pages_node(nid, gfp, order);
if (unlikely(!page))
break;
+ /*
+ * Higher order allocations must be able to be treated as
+ * indepdenent small pages by callers (as they can with
+ * small page allocs).
+ */
+ if (order)
+ split_page(page, order);
/*
* Careful, we allocate and map page-order pages, but
Powered by blists - more mailing lists