Message-Id: <1650596505.bxrmjmgjur.astroid@bobo.none>
Date: Fri, 22 Apr 2022 13:08:03 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"Torvalds, Linus" <torvalds@...ux-foundation.org>
Cc: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"ast@...nel.org" <ast@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"dborkman@...hat.com" <dborkman@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"hch@...radead.org" <hch@...radead.org>,
"hpa@...or.com" <hpa@...or.com>,
"imbrenda@...ux.ibm.com" <imbrenda@...ux.ibm.com>,
"Kernel-team@...com" <Kernel-team@...com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"mbenes@...e.cz" <mbenes@...e.cz>,
"mcgrof@...nel.org" <mcgrof@...nel.org>,
"pmladek@...e.com" <pmladek@...e.com>,
"rppt@...nel.org" <rppt@...nel.org>,
"song@...nel.org" <song@...nel.org>,
"songliubraving@...com" <songliubraving@...com>
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
Excerpts from Edgecombe, Rick P's message of April 22, 2022 12:29 pm:
> On Fri, 2022-04-22 at 10:12 +1000, Nicholas Piggin wrote:
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index e163372d3967..70933f4ed069 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -2925,12 +2925,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>> if (nr != nr_pages_request)
>> break;
>> }
>> - } else
>> - /*
>> - * Compound pages required for remap_vmalloc_page if
>> - * high-order pages.
>> - */
>> - gfp |= __GFP_COMP;
>> + }
>>
>> /* High-order pages or fallback path if "bulk" fails. */
>>
>> @@ -2944,6 +2939,13 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>> page = alloc_pages_node(nid, gfp, order);
>> if (unlikely(!page))
>> break;
>> + /*
>> + * Higher order allocations must be able to be treated as
>> + * independent small pages by callers (as they can with
>> + * small page allocs).
>> + */
>> + if (order)
>> + split_page(page, order);
>>
>> /*
>> * Careful, we allocate and map page-order pages, but
>
> FWIW, I like this direction. I think it needs to free them differently
> though? That path currently assumes they are high-order pages.

Yeah, I got a bit excited there, but I'm fairly sure that's the bug.
I'll do a proper patch.
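
To illustrate the free-path concern, here is a userspace toy model, not
kernel code: toy_page, toy_free_stepped and toy_free_each are hypothetical
names. It assumes a free loop that steps through the page array by
1 << order and frees one block per step, which only makes sense while each
block is still a single high-order page. Once the blocks are split, every
entry has to be freed on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct toy_page {
	unsigned int free_calls;  /* times this entry was passed to a free call */
	unsigned int freed_order; /* order given on the most recent free call */
};

/*
 * Free path that assumes pages[] is made of unsplit order-`order` blocks:
 * only the first entry of each block is freed, with the block's order.
 * Returns the number of free calls made.
 */
size_t toy_free_stepped(struct toy_page *pages, size_t nr_pages,
			unsigned int order)
{
	size_t calls = 0;

	assert(order < 8 * sizeof(size_t));
	for (size_t i = 0; i < nr_pages; i += (size_t)1 << order) {
		pages[i].free_calls++;
		pages[i].freed_order = order;
		calls++;
	}
	return calls;
}

/*
 * Free path for split blocks: every entry is an independent order-0 page.
 * Returns the number of free calls made.
 */
size_t toy_free_each(struct toy_page *pages, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++) {
		pages[i].free_calls++;
		pages[i].freed_order = 0;
	}
	return nr_pages;
}
```

In this model, 8 entries at order 2 get 2 calls from the stepped loop, and
entries 1-3 and 5-7 are never visited; the per-entry loop visits all 8.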

> I also wonder if we wouldn't need vm_struct->page_order anymore, and
> all the places that would percolate out to. Basically all the places
> where it iterates through vm_struct->pages with page_order stepping.
>
> Besides fixing the bisected issue (hopefully), it also more cleanly
> separates the mapping from the backing allocation logic. And then since
> all the pages are 4k (from the page allocator perspective), it would be
> easier to support non-huge page aligned sizes. i.e. not use up a whole
> additional 2MB page if you only need 4k more of allocation size.

Somewhat. I'm not sure about freeing pages back for general use when
they have page tables pointing to them. That will be a whole different
kettle of fish, I suspect. Honestly, I don't think fragmentation has
proven to be a bad enough problem yet that we'd want to do something
like that.

But there are other reasons to decouple them in general. Some ppc32
platforms, I think, had a different page size, not a huge page, that
they could use to map vmalloc memory, for example.

In any case I have no objections to making that part of it more
flexible.
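
To put a number on the size point quoted above, here is a small sketch of
the rounding arithmetic. It assumes a 4 KiB base page and a 2 MiB huge page
(the sizes named in the quote); the TOY_* constants and toy_* helpers are
hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE ((size_t)4 << 10) /* assumed 4 KiB base page */
#define TOY_HUGE_SIZE ((size_t)2 << 20) /* assumed 2 MiB huge page */

/* Round size up to a multiple of align; align must be a power of two. */
size_t toy_round_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Bytes allocated beyond the request when rounding up to align. */
size_t toy_wasted_bytes(size_t size, size_t align)
{
	return toy_round_up(size, align) - size;
}
```

For a request of 2 MiB + 4 KiB, rounding to the huge page size gives 4 MiB
(2 MiB - 4 KiB unused), while rounding to the base page size leaves nothing
unused.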

Thanks,
Nick