Date: Wed, 17 Apr 2024 16:32:49 -0700
From: Song Liu <song@...nel.org>
To: Mike Rapoport <rppt@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>, Peter Zijlstra <peterz@...radead.org>, 
	linux-kernel@...r.kernel.org, Alexandre Ghiti <alexghiti@...osinc.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Bjorn Topel <bjorn@...nel.org>, 
	Catalin Marinas <catalin.marinas@....com>, Christophe Leroy <christophe.leroy@...roup.eu>, 
	"David S. Miller" <davem@...emloft.net>, Dinh Nguyen <dinguyen@...nel.org>, 
	Donald Dutile <ddutile@...hat.com>, Eric Chanudet <echanude@...hat.com>, 
	Heiko Carstens <hca@...ux.ibm.com>, Helge Deller <deller@....de>, Huacai Chen <chenhuacai@...nel.org>, 
	Kent Overstreet <kent.overstreet@...ux.dev>, Luis Chamberlain <mcgrof@...nel.org>, 
	Michael Ellerman <mpe@...erman.id.au>, Nadav Amit <nadav.amit@...il.com>, 
	Palmer Dabbelt <palmer@...belt.com>, Puranjay Mohan <puranjay12@...il.com>, 
	Rick Edgecombe <rick.p.edgecombe@...el.com>, Russell King <linux@...linux.org.uk>, 
	Steven Rostedt <rostedt@...dmis.org>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>, 
	Thomas Gleixner <tglx@...utronix.de>, Will Deacon <will@...nel.org>, bpf@...r.kernel.org, 
	linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-mips@...r.kernel.org, linux-mm@...ck.org, linux-modules@...r.kernel.org, 
	linux-parisc@...r.kernel.org, linux-riscv@...ts.infradead.org, 
	linux-s390@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
	linuxppc-dev@...ts.ozlabs.org, loongarch@...ts.linux.dev, 
	netdev@...r.kernel.org, sparclinux@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH v4 05/15] mm: introduce execmem_alloc() and execmem_free()

On Tue, Apr 16, 2024 at 12:23 AM Mike Rapoport <rppt@...nel.org> wrote:
>
> On Mon, Apr 15, 2024 at 06:36:39PM +0100, Mark Rutland wrote:
> > On Mon, Apr 15, 2024 at 09:52:41AM +0200, Peter Zijlstra wrote:
> > > On Thu, Apr 11, 2024 at 07:00:41PM +0300, Mike Rapoport wrote:
> > > > +/**
> > > > + * enum execmem_type - types of executable memory ranges
> > > > + *
> > > > + * There are several subsystems that allocate executable memory.
> > > > + * Architectures define different restrictions on placement,
> > > > + * permissions, alignment and other parameters for memory that can be used
> > > > + * by these subsystems.
> > > > + * Types in this enum identify subsystems that allocate executable memory
> > > > + * and let architectures define parameters for ranges suitable for
> > > > + * allocations by each subsystem.
> > > > + *
> > > > + * @EXECMEM_DEFAULT: default parameters that would be used for types that
> > > > + * are not explicitly defined.
> > > > + * @EXECMEM_MODULE_TEXT: parameters for module text sections
> > > > + * @EXECMEM_KPROBES: parameters for kprobes
> > > > + * @EXECMEM_FTRACE: parameters for ftrace
> > > > + * @EXECMEM_BPF: parameters for BPF
> > > > + * @EXECMEM_TYPE_MAX:
> > > > + */
> > > > +enum execmem_type {
> > > > + EXECMEM_DEFAULT,
> > > > + EXECMEM_MODULE_TEXT = EXECMEM_DEFAULT,
> > > > + EXECMEM_KPROBES,
> > > > + EXECMEM_FTRACE,
> > > > + EXECMEM_BPF,
> > > > + EXECMEM_TYPE_MAX,
> > > > +};
> > >
> > > Can we please get a break-down of how all these types are actually
> > > different from one another?
> > >
> > > I'm thinking some platforms have a tiny immediate space (arm64 comes to
> > > mind) and has less strict placement constraints for some of them?
> >
> > Yeah, and really I'd *much* rather deal with that in arch code, as I have said
> > several times.
> >
> > For arm64 we have two basic restrictions:
> >
> > 1) Direct branches can go +/-128M
> >    We can expand this range by having direct branches go to PLTs, at a
> >    performance cost.
> >
> > 2) PREL32 relocations can go +/-2G
> >    We cannot expand this further.
> >
> > * We don't need to allocate memory for ftrace. We do not use trampolines.
> >
> > * Kprobes XOL areas don't care about either of those; we don't place any
> >   PC-relative instructions in those. Maybe we want to in future.
> >
> > * Modules care about both; we'd *prefer* to place them within +/-128M of all
> >   other kernel/module code, but if there's no space we can use PLTs and expand
> >   that to +/-2G. Since modules can reference other modules, that ends up
> >   actually being halved, and modules have to fit within some 2G window that
> >   also covers the kernel.

Is +/- 2G enough for all realistic use cases? If so, I guess we don't
really need EXECMEM_ANYWHERE below?
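
To make the two arm64 limits quoted above concrete, here is a minimal
sketch of the range check they imply. It is only an illustration: the
names (in_range, classify_reach, enum reach) are hypothetical and do not
come from the patch or from arm64 code, and the addresses passed in are
treated as plain 64-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits taken from the discussion: +/-128M for direct branches,
 * +/-2G for PREL32 relocations. */
#define SZ_128M ((int64_t)128 << 20)
#define SZ_2G   ((int64_t)2 << 30)

/* Hypothetical classification, for illustration only. */
enum reach {
	REACH_DIRECT,	/* within +/-128M: direct branch works */
	REACH_PLT,	/* within +/-2G: reachable, but branches need a PLT */
	REACH_NONE,	/* beyond +/-2G: PREL32 cannot express the offset */
};

/* True if the signed displacement from 'from' to 'to' fits in
 * [-limit, limit). */
static bool in_range(uint64_t from, uint64_t to, int64_t limit)
{
	int64_t delta = (int64_t)(to - from);

	return delta >= -limit && delta < limit;
}

static enum reach classify_reach(uint64_t from, uint64_t to)
{
	if (in_range(from, to, SZ_128M))
		return REACH_DIRECT;
	if (in_range(from, to, SZ_2G))
		return REACH_PLT;
	return REACH_NONE;
}
```

If +/-2G covers every realistic case, REACH_NONE would never be hit in
practice, which is the point of the question above.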

> >
> > * I'm not sure about BPF's requirements; it seems happy doing the same as
> >   modules.
>
> BPF is happy with vmalloc().
>
> > So if we *must* use a common execmem allocator, what we'd really want is our own
> > types, e.g.
> >
> >       EXECMEM_ANYWHERE
> >       EXECMEM_NOPLT
> >       EXECMEM_PREL32
> >
> > ... and then we use those in arch code to implement module_alloc() and friends.
>
> I'm looking at execmem_types more as definition of the consumers, maybe I
> should have named the enum execmem_consumer in the first place.

I think looking at execmem_type from consumers' point of view adds
unnecessary complexity. IIUC, for most (if not all) archs, ftrace, kprobe,
and bpf (and maybe also module text) all have the same requirements.
Did I miss something?

IOW, we have

enum execmem_type {
        EXECMEM_DEFAULT,
        EXECMEM_TEXT,
        EXECMEM_KPROBES = EXECMEM_TEXT,
        EXECMEM_FTRACE = EXECMEM_TEXT,
        EXECMEM_BPF = EXECMEM_TEXT,      /* we may end up without _KPROBE, _FTRACE, _BPF */
        EXECMEM_DATA,  /* rw */
        EXECMEM_RO_DATA,
        EXECMEM_RO_AFTER_INIT,
        EXECMEM_TYPE_MAX,
};

Does this make sense?
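
As a rough sketch of what the aliasing buys: since _KPROBES, _FTRACE and
_BPF share the value of EXECMEM_TEXT, a per-type parameter table needs
only one entry for all of them. Everything except the enum itself is
hypothetical here (struct execmem_range_sketch, its fields, range_for(),
and the permission values are made up for illustration and are not the
structures from the patch).

```c
#include <stddef.h>

enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_TEXT,
	EXECMEM_KPROBES = EXECMEM_TEXT,
	EXECMEM_FTRACE = EXECMEM_TEXT,
	EXECMEM_BPF = EXECMEM_TEXT,
	EXECMEM_DATA,		/* rw */
	EXECMEM_RO_DATA,
	EXECMEM_RO_AFTER_INIT,
	EXECMEM_TYPE_MAX,
};

/* Hypothetical per-type parameters; real ones would also carry
 * address ranges, alignment and so on. */
struct execmem_range_sketch {
	const char *name;
	int executable;
	int writable;
};

/* One slot per distinct enum value; the aliases need no entries. */
static const struct execmem_range_sketch ranges[EXECMEM_TYPE_MAX] = {
	[EXECMEM_DEFAULT]	= { "default", 1, 0 },	/* placeholder values */
	[EXECMEM_TEXT]		= { "text", 1, 0 },
	[EXECMEM_DATA]		= { "data", 0, 1 },
	[EXECMEM_RO_DATA]	= { "ro_data", 0, 0 },
	[EXECMEM_RO_AFTER_INIT]	= { "ro_after_init", 0, 0 },	/* state after init */
};

/* Look up the parameters for a type; NULL if out of range. */
static const struct execmem_range_sketch *range_for(enum execmem_type type)
{
	if ((unsigned int)type >= (unsigned int)EXECMEM_TYPE_MAX)
		return NULL;
	return &ranges[type];
}
```

With this layout, range_for(EXECMEM_BPF) and range_for(EXECMEM_TEXT)
return the same entry, and dropping the aliases later would not change
the table.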

Thanks,
Song
