lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZHF21CSxyjQy9C7F@moria.home.lan>
Date:   Fri, 26 May 2023 23:19:48 -0400
From:   Kent Overstreet <kent.overstreet@...ux.dev>
To:     Song Liu <song@...nel.org>
Cc:     linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
        mcgrof@...nel.org, peterz@...radead.org, tglx@...utronix.de,
        x86@...nel.org, rppt@...nel.org
Subject: Re: [PATCH 1/3] module: Introduce module_alloc_type

On Fri, May 26, 2023 at 05:03:18PM -0700, Song Liu wrote:
> On Fri, May 26, 2023 at 4:39 PM Kent Overstreet
> <kent.overstreet@...ux.dev> wrote:
> >
> [...]
> 
> >
> > But it should be an internal implementation detail, I don't think we
> > want the external interface tied to vmalloc -
> >
> > > These two APIs allow the core code work with all archs. They won't break
> > > sub-page allocations. (not all archs will have sub-page allocations)
> >
> > So yes, text_poke() doesn't work on all architectures, and we need a
> > fallback.
> >
> > But if the fallback needs to go the unprotect/protect route, I think we
> > need to put that in the helper, and not expose it. Part of what people
> > are wanting is to limit or eliminate pages that are RWX, so we
> > definitely shouldn't be creating new interfaces to flipping page
> > permissions: that should be pushed down to as low a level as possible.
> >
> > E.g., with my jitalloc_update(), a more complete version would look like
> >
> > void jitalloc_update(void *dst, void *src, size_t len)
> > {
> >         if (text_poke_available) {
> >                 text_poke(...);
> >         } else {
> >                 unprotect();
> >                 memcpy();
> >                 protect();
> >         }
> > }
> 
> I think this doesn't work for sub page allocation?

Perhaps I elided too much - it does if you add a single lock. You can't
do that if it's not a common helper.

> At the end of all this, we will have modules running from huge pages, data
> and text. It will give significant performance benefit when some key driver
> cannot be compiled into the kernel.

Yeah, I've seen the numbers for the perf impact of running as a module
due to the extra TLB overhead - but Mike's recent data was showing that
this doesn't matter nearly as much as data as it does for text.

> > Also - module_memory_fill_type(), module_memory_invalidate_type() - I
> > know these are for BPF, but could you explain why we need them in the
> > external interface here? Could they perhaps be small helpers in the bpf
> > code that use something like jitalloc_update()?
> 
> These are used by all users, not just BPF. 1/3 uses them in
> kernel/module/main.c. I didn't use them in 3/3 as it is arch code, but I can
> use them instead of text_poke_* (and that is probably better code style).
> As I am writing the email, I think I should use it in ftrace.c (2/3) to avoid
> the __weak function.

Ok. Could we make it clearer why they're needed and what they're for?

I know bpf fills in invalid instructions initially; I didn't read
through enough code to find the "why", and let's document that to save
other people the same effort.

And do they really need to be callbacks in mod_alloc_params...?

Do we need the other things in mod_alloc_params/vmalloc_params?
 - your granularity field says it's for specifying PAGE/PMD size: we
   definitely do not want that. We've had way too much code along the
   lines of "implement hugepages for x", we need to stop doing that.
   It's an internal mm/ detail.

 - vmalloc_params: we need gfp_t and pgprot_t, but those should be
   normal arguments. start/end/vm_flags? They seem like historical
   module baggage, can we get rid of them?

> > > OTOH, jit_alloc (as-is) solves a subset of the problem (only the text).
> > > We can probably use it (with some updates) instead of the vmap_area
> > > based allocator. But I guess we need to align the direction first.
> >
> > Let's see if we can get tglx to chime in again, since he was pretty
> > vocal in the last discussion.
> >
> > It's also good practice to try to summarize all the relevant "whys" from
> > the discussion that went into a feature and put that in at least the
> > commit message - or even better, header file comments.
> >
> > Also, organization: the module code is a huge mess, let's _please_ do
> > split this out and think about organization a bit more, not add to the
> > mess :)
> 
> I don't really think module code is very messy at the moment. If
> necessary, we can put module_alloc_type related code in a
> separate file.

Hey, it's been organized since I last looked at it :) But I tihink this
belongs in mm/, not module code.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ