linux-kernel - Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped

Message-ID: <CAPhsuW541pcsMKYah=2U8mUs8is3jAiNKC8Erte=RkAUGFO9EA@mail.gmail.com>
Date:   Thu, 18 May 2023 09:33:20 -0700
From:   Song Liu <song@...nel.org>
To:     Mike Rapoport <rppt@...nel.org>
Cc:     Kent Overstreet <kent.overstreet@...ux.dev>, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
        x86@...nel.org
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()

On Thu, May 18, 2023 at 8:24 AM Mike Rapoport <rppt@...nel.org> wrote:
>
> On Wed, May 17, 2023 at 11:35:56PM -0400, Kent Overstreet wrote:
> > On Wed, Mar 08, 2023 at 11:41:02AM +0200, Mike Rapoport wrote:
> > > From: "Mike Rapoport (IBM)" <rppt@...nel.org>
> > >
> > > When set_memory or set_direct_map APIs are used to change attributes or
> > > permissions for chunks of several pages, the large PMD that maps these
> > > pages in the direct map must be split. Fragmenting the direct map in such
> > > a manner causes TLB pressure and, eventually, performance degradation.
> > >
> > > To avoid excessive direct map fragmentation, add the ability to allocate
> > > "unmapped" pages with the __GFP_UNMAPPED flag, which causes removal of the
> > > allocated pages from the direct map and uses a cache of the unmapped pages.
> > >
> > > This cache is replenished with higher order pages with preference for
> > > PMD_SIZE pages when possible so that there will be fewer splits of large
> > > pages in the direct map.
> > >
> > > The cache is implemented as a buddy allocator, so it can serve high order
> > > allocations of unmapped pages.
> >
> > So I'm late to this discussion; I stumbled in because of my own run-in
> > with executable memory allocation.
> >
> > I understand that post LSF this patchset seems to not be going anywhere,
> > but OTOH there's also been a desire for better executable memory
> > allocation; as noted by tglx and elsewhere, there _is_ a definite
> > performance impact on page size with kernel text - I've seen numbers in
> > the multiple single digit percentage range in the past.
> >
> > This patchset does seem to me to be roughly the right approach for that,
> > and coupled with the slab allocator for sub-page sized allocations it
> > seems there's the potential for getting a nice interface that spans the
> > full range of allocation sizes, from small bpf/trampoline allocations up
> > to modules.
> >
> > Is this patchset worth reviving/continuing with? Was it really just the
> > needed module refactoring that was the blocker?
>
> As I see it, this patchset is only one building block out of three? four?
> If we are to repurpose it for code allocations it should be something like
>
> 1) allocate a 2M page to fill the cache
> 2) remove this page from the direct map
> 3) map the 2M page ROX in the module address space (usually some part of
>    the vmalloc address space)
> 4) allocate a smaller chunk of that page to the actual caller (bpf,
>    modules, whatever)
>
> Right now (3) and (4) won't work for modules because they mix code and data
> in a single allocation.
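
The four steps quoted above can be modelled in userspace as a rough sketch.
This is not kernel code and not the patchset's implementation: the names
code_cache, cache_fill() and cache_alloc() are hypothetical, the 64-byte
granule is an arbitrary assumption, and steps (2) and (3) are represented only
by flags, because actually removing a page from the direct map or mapping it
ROX requires kernel interfaces that cannot be exercised from a standalone
program. The sketch only shows the shape of the flow: one 2M-aligned chunk is
obtained up front, and small allocations are carved out of it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE (2UL << 20) /* 2M, the PMD-mapped page size on x86-64 */
#define GRANULE    64UL        /* assumed sub-allocation granularity */

/* Hypothetical userspace model of a single-chunk code cache. */
struct code_cache {
	unsigned char *base;  /* start of the 2M chunk, NULL if unfilled */
	size_t used;          /* bytes already handed out */
	int in_direct_map;    /* model only: 0 once step (2) is "done" */
	int mapped_rox;       /* model only: 1 once step (3) is "done" */
};

/* Steps (1)-(3): obtain one 2M-aligned chunk and record its state. */
static int cache_fill(struct code_cache *c)
{
	c->base = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE); /* step (1) */
	if (!c->base)
		return -1;
	c->used = 0;
	c->in_direct_map = 0; /* step (2), modelled as a flag */
	c->mapped_rox = 1;    /* step (3), modelled as a flag */
	return 0;
}

/* Step (4): hand a smaller piece of the chunk to the caller. */
static void *cache_alloc(struct code_cache *c, size_t size)
{
	size_t rounded;
	void *p;

	if (!c->base || size == 0 || size > CHUNK_SIZE)
		return NULL;
	rounded = (size + GRANULE - 1) & ~(GRANULE - 1);
	if (rounded > CHUNK_SIZE - c->used)
		return NULL; /* chunk exhausted; a real cache would refill */
	p = c->base + c->used;
	c->used += rounded;
	return p;
}
```

A simple bump pointer is used here only to keep the sketch short; the patch
description quoted earlier says the real cache is implemented as a buddy
allocator, which can also free and coalesce chunks.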

I am working on patches based on the discussion in [1]. I am planning to
send v1 for review in a week or so.

Thanks,
Song

[1] https://lore.kernel.org/linux-mm/20221107223921.3451913-1-song@kernel.org/

[...]