Message-ID: <0209B426-E425-44C2-825C-8AAC59B5BB2D@fb.com>
Date: Wed, 12 Oct 2022 19:01:44 +0000
From: Song Liu <songliubraving@...a.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Luis Chamberlain <mcgrof@...nel.org>
CC: Song Liu <songliubraving@...a.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
Kernel Team <Kernel-team@...com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"song@...nel.org" <song@...nel.org>, "hch@....de" <hch@....de>,
"x86@...nel.org" <x86@...nel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"Hansen, Dave" <dave.hansen@...el.com>,
"urezki@...il.com" <urezki@...il.com>
Subject: Re: [RFC v2 4/4] vmalloc_exec: share a huge page with kernel text
> On Oct 12, 2022, at 11:38 AM, Edgecombe, Rick P <rick.p.edgecombe@...el.com> wrote:
>
> On Wed, 2022-10-12 at 05:37 +0000, Song Liu wrote:
>>> Then you have code that operates on module text like:
>>> 	if (is_vmalloc_or_module_addr(addr))
>>> 		pfn = vmalloc_to_pfn(addr);
>>>
>>> It looks like it would work (on x86 at least). Should it be expected
>>> to?
>>>
>>> Especially after this patch, where there is memory that isn't even
>>> tracked by the original vmap_area trees, it is pretty much a separate
>>> allocator. So I think it might be nice to spell out which other vmalloc
>>> APIs work with these new functions since they are named "vmalloc".
>>> Maybe just say none of them do.
>>
>> I guess it is fair to call this a separate allocator. Maybe
>> vmalloc_exec is not the right name? I do think this is the best
>> way to build an allocator with vmap tree logic.
>
> Yea, I don't know about the name. I think someone else suggested it
> specifically, right?
I think Luis suggested renaming module_alloc to vmalloc_exec. But I
guess we still need module_alloc for module data allocations.
>
> I had called mine perm_alloc() so it could also handle read-only and
> other permissions.
What other permissions do we use? We can probably duplicate
the free_text_area_* tree logic for other cases.
> If you keep vmalloc_exec() it needs some big
> comments about which APIs can work with it, and an audit of the
> existing code that works on module and JIT text.
>
>>
>>>
>>>
>>> Separate from that, I guess you are planning to make this limited to
>>> certain architectures? It might be better to put logic with assumptions
>>> about x86 boot time page table details inside arch/x86 somewhere.
>>
>> Yes, the architecture needs some text_poke mechanism to use this.
>
> It also depends on the space between _etext and the PMD aligned _etext
> to be present and not get used by anything else. For other
> architectures, there might be rodata there or other things.
Good point! We need to make sure this part is not used by other things.
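
To make the size of that region concrete, here is a rough userspace sketch of the arithmetic, not kernel code. It assumes 2 MiB PMD-level huge pages (x86_64 with 4 KiB base pages); the names SKETCH_PMD_SIZE, sketch_pmd_align() and sketch_text_tail_gap() are made up for illustration and do not come from the patch.

```c
#include <stdint.h>

/* Assumption: one PMD entry maps 2 MiB, as on x86_64 with 4 KiB pages. */
#define SKETCH_PMD_SIZE ((uint64_t)2 << 20)

/* Round addr up to the next PMD boundary (unchanged if already aligned). */
static uint64_t sketch_pmd_align(uint64_t addr)
{
	return (addr + SKETCH_PMD_SIZE - 1) & ~(SKETCH_PMD_SIZE - 1);
}

/*
 * Bytes between the end of kernel text and the end of the huge page that
 * holds it. This is the tail that must stay unused by anything else if
 * executable allocations are to share that huge page.
 */
static uint64_t sketch_text_tail_gap(uint64_t etext)
{
	return sketch_pmd_align(etext) - etext;
}
```

With an _etext that is already PMD aligned the gap is zero, so there would be nothing to share in that case.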
>
>> On BPF side, x86_64 calls this directly from arch code (jit engine),
>> so it is mostly covered. For modules, we need to handle this better.
>
> That old RFC has some ideas around this. I kind of like your
> incremental approach though. To me it seems to be moving in the right
> direction.
Thanks!
Song