Message-ID: <Ys/FHgonNLo29Bp2@worktop.programming.kicks-ass.net>
Date: Thu, 14 Jul 2022 09:26:22 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: Song Liu <song@...nel.org>, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, mcgrof@...nel.org,
rostedt@...dmis.org, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, mhiramat@...nel.org, naveen.n.rao@...ux.ibm.com,
davem@...emloft.net, anil.s.keshavamurthy@...el.com,
keescook@...omium.org, dave@...olabs.net, daniel@...earbox.net,
kernel-team@...com, x86@...nel.org, dave.hansen@...ux.intel.com,
rick.p.edgecombe@...el.com, akpm@...ux-foundation.org
Subject: Re: [PATCH bpf-next 1/3] mm/vmalloc: introduce vmalloc_exec which
allocates RO+X memory

On Wed, Jul 13, 2022 at 10:16:36PM -0700, Christoph Hellwig wrote:
> On Wed, Jul 13, 2022 at 12:20:09PM +0200, Peter Zijlstra wrote:
> > Start by adding VM_TOPDOWN_VMAP, which instead of returning the lowest
> > (leftmost) vmap_area that fits, picks the highest (rightmost).
> >
> > Then add module_alloc_data() that uses VM_TOPDOWN_VMAP and make
> > ARCH_WANTS_MODULE_DATA_IN_VMALLOC use that instead of vmalloc (with a
> > weak function doing the vmalloc).
> >
> > This gets you a module range whose bottom is RO+X only, while the top
> > is shattered between the different !X types.
> >
> > Then track the boundary between X and !X and ensure module_alloc_data()
> > and module_alloc() never cross over and stay strictly separated.
> >
> > Then change all module_alloc() users to expect RO+X memory, instead of
> > RW.
> >
> > Then make sure any extension of the X range is 2M aligned.
> >
> > And presto, *everybody* always uses 2M TLB for text, modules, bpf,
> > ftrace, the lot and nobody is tracking chunks.
> >
> > Maybe migration can be eased by instead providing module_alloc_text()
> > and ARCH_WANTS_MODULE_ALLOC_TEXT.
>
> This all looks pretty sensible. How are we going to do the initial
> write to the executable memory, though?

With something like text_poke_memcpy(). I suppose that the proposed
ARCH_WANTS_MODULE_ALLOC_TEXT needs to imply availability of that too.
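
To make the copy step concrete, here is a rough user-space sketch of a
chunk-at-a-time copy. It is not the kernel implementation: poke_one_chunk()
and poke_copy() are hypothetical names, and a plain memcpy() stands in for
the write through a temporary writable alias. The only point is that each
step stays inside one chunk-aligned window, and that the chunk size is a
parameter.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_4K ((size_t)4096)
#define CHUNK_2M ((size_t)2 << 20)

/*
 * Hypothetical stand-in for a single poke. The caller guarantees that
 * [dst, dst + len) does not cross a chunk boundary. In a kernel this is
 * where a temporary writable mapping would be used; here it is a memcpy.
 */
static void poke_one_chunk(unsigned char *dst, const unsigned char *src,
			   size_t len)
{
	memcpy(dst, src, len);
}

/*
 * Copy len bytes from src to base + off, split so that every step stays
 * within one chunk-sized, chunk-aligned window (offsets are relative to
 * base). Returns the number of steps taken.
 */
static size_t poke_copy(unsigned char *base, size_t off,
			const unsigned char *src, size_t len, size_t chunk)
{
	size_t steps = 0;

	while (len) {
		size_t room = chunk - (off % chunk);	/* bytes left in window */
		size_t n = len < room ? len : room;

		poke_one_chunk(base + off, src, n);
		off += n;
		src += n;
		len -= n;
		steps++;
	}
	return steps;
}
```

With a 4K chunk, a copy that straddles a page boundary takes one step per
page touched; passing CHUNK_2M instead makes the same copy a single step.
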
If the 4K copy thing ends up being a bottleneck we can easily extend
that to have a 2M option as well.
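
For reference, the allocation scheme quoted above can be modelled in user
space roughly as follows. This is a deliberately simplified sketch: it uses
two bump allocators with no free path rather than vmap_area lookups, and
struct mod_range, alloc_text() and alloc_data() are made-up names, not
existing kernel interfaces. It only shows the invariants: text grows up
from the bottom, data grows down from the top, the X range is extended in
2M steps, and the two never cross.

```c
#include <stdint.h>

#define SZ_4K		((uint64_t)4096)
#define SZ_2M		((uint64_t)2 << 20)
#define ALLOC_FAIL	UINT64_MAX

/* Hypothetical model of the module range [start, end); start is 2M aligned. */
struct mod_range {
	uint64_t start, end;
	uint64_t text_next;	/* next free text byte, grows upwards */
	uint64_t x_end;		/* end of the X region, always 2M aligned */
	uint64_t data_low;	/* lowest data address in use, grows downwards */
};

static uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

static void mod_range_init(struct mod_range *r, uint64_t start, uint64_t end)
{
	r->start = start;
	r->end = end;
	r->text_next = start;
	r->x_end = start;
	r->data_low = end;
}

/* Bottom-up: hand out text and extend the X region in 2M steps if needed. */
static uint64_t alloc_text(struct mod_range *r, uint64_t size)
{
	uint64_t addr = r->text_next;
	uint64_t new_x_end;

	if (size == 0 || size > r->data_low - addr)
		return ALLOC_FAIL;

	new_x_end = align_up(addr + size, SZ_2M);
	if (new_x_end < r->x_end)
		new_x_end = r->x_end;
	if (new_x_end > r->data_low)	/* extension would run into !X data */
		return ALLOC_FAIL;

	r->x_end = new_x_end;
	r->text_next = addr + size;
	return addr;
}

/* Top-down: hand out page-aligned !X memory, never below the X boundary. */
static uint64_t alloc_data(struct mod_range *r, uint64_t size)
{
	if (size == 0 || size > r->data_low - r->x_end)
		return ALLOC_FAIL;

	r->data_low = (r->data_low - size) & ~(SZ_4K - 1);
	return r->data_low;
}
```

In this model a 100 byte text allocation in an empty range already claims
the first 2M for X, which is the trade-off the scheme accepts in exchange
for never having to track sub-2M chunks.
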