linux-kernel - Re: [PATCH 00/13] mm: jit/text allocator

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZH20XkD74prrdN4u@FVFF77S0Q05N>
Date:   Mon, 5 Jun 2023 11:09:34 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Mike Rapoport <rppt@...nel.org>
Cc:     Kent Overstreet <kent.overstreet@...ux.dev>,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        "David S. Miller" <davem@...emloft.net>,
        Dinh Nguyen <dinguyen@...nel.org>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Helge Deller <deller@....de>,
        Huacai Chen <chenhuacai@...nel.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Russell King <linux@...linux.org.uk>,
        Song Liu <song@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>, bpf@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-mips@...r.kernel.org,
        linux-mm@...ck.org, linux-modules@...r.kernel.org,
        linux-parisc@...r.kernel.org, linux-riscv@...ts.infradead.org,
        linux-s390@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org, loongarch@...ts.linux.dev,
        netdev@...r.kernel.org, sparclinux@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH 00/13] mm: jit/text allocator

On Mon, Jun 05, 2023 at 12:20:40PM +0300, Mike Rapoport wrote:
> On Fri, Jun 02, 2023 at 10:35:09AM +0100, Mark Rutland wrote:
> > On Thu, Jun 01, 2023 at 02:14:56PM -0400, Kent Overstreet wrote:
> > > On Thu, Jun 01, 2023 at 05:12:03PM +0100, Mark Rutland wrote:
> > > > For a while I have wanted to give kprobes its own allocator so that it can work
> > > > even with CONFIG_MODULES=n, and so that it doesn't have to waste VA space in
> > > > the modules area.
> > > > 
> > > > Given that, I think these should have their own allocator functions that can be
> > > > provided independently, even if those happen to use common infrastructure.
> > > 
> > > How much memory can kprobes conceivably use? I think we also want to try
> > > to push back on combinatorial new allocators, if we can.
> > 
> > That depends on who's using it, and how (e.g. via BPF).
> > 
> > To be clear, I'm not necessarily asking for entirely different allocators, but
> > I do thinkg that we want wrappers that can at least pass distinct start+end
> > parameters to a common allocator, and for arm64's modules code I'd expect that
> > we'd keep the range falblack logic out of the common allcoator, and just call
> > it twice.
> > 
> > > > > Several architectures override module_alloc() because of various
> > > > > constraints where the executable memory can be located and this causes
> > > > > additional obstacles for improvements of code allocation.
> > > > > 
> > > > > This set splits code allocation from modules by introducing
> > > > > jit_text_alloc(), jit_data_alloc() and jit_free() APIs, replaces call
> > > > > sites of module_alloc() and module_memfree() with the new APIs and
> > > > > implements core text and related allocation in a central place.
> > > > > 
> > > > > Instead of architecture specific overrides for module_alloc(), the
> > > > > architectures that require non-default behaviour for text allocation must
> > > > > fill jit_alloc_params structure and implement jit_alloc_arch_params() that
> > > > > returns a pointer to that structure. If an architecture does not implement
> > > > > jit_alloc_arch_params(), the defaults compatible with the current
> > > > > modules::module_alloc() are used.
> > > > 
> > > > As above, I suspect that each of the callsites should probably be using common
> > > > infrastructure, but I don't think that a single jit_alloc_arch_params() makes
> > > > sense, since the parameters for each case may need to be distinct.
> > > 
> > > I don't see how that follows. The whole point of function parameters is
> > > that they may be different :)
> > 
> > What I mean is that jit_alloc_arch_params() tries to aggregate common
> > parameters, but they aren't actually common (e.g. the actual start+end range
> > for allocation).
> 
> jit_alloc_arch_params() tries to aggregate architecture constraints and
> requirements for allocations of executable memory and this exactly what
> the first 6 patches of this set do.
> 
> A while ago Thomas suggested to use a structure that parametrizes
> architecture constraints by the memory type used in modules [1] and Song
> implemented the infrastructure for it and x86 part [2].
> 
> I liked the idea of defining parameters in a single structure, but I
> thought that approaching the problem from the arch side rather than from
> modules perspective will be better starting point, hence these patches.
> 
> I don't see a fundamental reason why a single structure cannot describe
> what is needed for different code allocation cases, be it modules, kprobes
> or bpf. There is of course an assumption that the core allocations will be
> the same for all the users, and it seems to me that something like 
> 
> * allocate physical memory if allocator caches are empty
> * map it in vmalloc or modules address space
> * return memory from the allocator cache to the caller
> 
> will work for all usecases.
> 
> We might need separate caches for different cases on different
> architectures, and a way to specify what cache should be used in the
> allocator API, but that does not contradict a single structure for arch
> specific parameters, but only makes it more elaborate, e.g. something like
> 
> enum jit_type {
> 	JIT_MODULES_TEXT,
> 	JIT_MODULES_DATA,
> 	JIT_KPROBES,
> 	JIT_FTRACE,
> 	JIT_BPF,
> 	JIT_TYPE_MAX,
> };
> 
> struct jit_alloc_params {
> 	struct jit_range	ranges[JIT_TYPE_MAX];
> 	/* ... */
> };
> 
> > > Can you give more detail on what parameters you need? If the only extra
> > > parameter is just "does this allocation need to live close to kernel
> > > text", that's not that big of a deal.
> > 
> > My thinking was that we at least need the start + end for each caller. That
> > might be it, tbh.
> 
> Do you mean that modules will have something like
> 
> 	jit_text_alloc(size, MODULES_START, MODULES_END);
> 
> and kprobes will have
> 
> 	jit_text_alloc(size, KPROBES_START, KPROBES_END);
> ?

Yes.

> It sill can be achieved with a single jit_alloc_arch_params(), just by
> adding enum jit_type parameter to jit_text_alloc().

That feels backwards to me; it centralizes a bunch of information about
distinct users to be able to shove that into a static array, when the callsites
can pass that information. 

What's *actually* common after separating out the ranges? Is it just the
permissions?

If we want this to be able to share allocations and so on, why can't we do this
like a kmem_cache, and have the callsite pass a pointer to the allocator data?
That would make it easy for callsites to share an allocator or use a distinct
one.

Thanks,
Mark.

> [1] https://lore.kernel.org/linux-mm/87v8mndy3y.ffs@tglx/ 
> [2] https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org
> 
> > Thanks,
> > Mark.
> 
> -- 
> Sincerely yours,
> Mike.