linux-kernel - Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <31a75f95-6e6e-b640-2d95-08a95ea8cf51@intel.com>
Date:   Tue, 28 Dec 2021 08:14:56 -0800
From:   Dave Hansen <dave.hansen@...el.com>
To:     Kefeng Wang <wangkefeng.wang@...wei.com>,
        Jonathan Corbet <corbet@....net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linuxppc-dev@...ts.ozlabs.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
        linux-arm-kernel@...ts.infradead.org
Cc:     Nicholas Piggin <npiggin@...il.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings

On 12/28/21 2:26 AM, Kefeng Wang wrote:
>>> There are some disadvantages about this feature[2], one of the main
>>> concerns is the possible memory fragmentation/waste in some scenarios,
>>> also archs must ensure that any arch specific vmalloc allocations that
>>> require PAGE_SIZE mappings(eg, module alloc with STRICT_MODULE_RWX)
>>> use the VM_NO_HUGE_VMAP flag to inhibit larger mappings.
>> That just says that x86 *needs* PAGE_SIZE allocations.  But, what
>> happens if VM_NO_HUGE_VMAP is not passed (like it was in v1)?  Will the
>> subsequent permission changes just fragment the 2M mapping?
> 
> Yes, without VM_NO_HUGE_VMAP, it could fragment the 2M mapping.
> 
> When module alloc with STRICT_MODULE_RWX on x86, it calls
> __change_page_attr()
> 
> from set_memory_ro/rw/nx which will split large page, so there is no
> need to make
> 
> module alloc with HUGE_VMALLOC.

This all sounds very fragile to me.  Every time a new architecture would
get added for huge vmalloc() support, the developer needs to know to go
find that architecture's module_alloc() and add this flag.  They next
guy is going to forget, just like you did.

Considering that this is not a hot path, a weak function would be a nice
choice:

/* vmalloc() flags used for all module allocations. */
unsigned long __weak arch_module_vm_flags()
{
	/*
	 * Modules use a single, large vmalloc().  Different
	 * permissions are applied later and will fragment
	 * huge mappings.  Avoid using huge pages for modules.
	 */
	return VM_NO_HUGE_VMAP;
}

Stick that in some the common module code, next to:

> void * __weak module_alloc(unsigned long size)
> {
>         return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
...

Then, put arch_module_vm_flags() in *all* of the module_alloc()
implementations, including the generic one.  That way (even with a new
architecture) whoever copies-and-pastes their module_alloc()
implementation is likely to get it right.  The next guy who just does a
"select HAVE_ARCH_HUGE_VMALLOC" will hopefully just work.

VM_FLUSH_RESET_PERMS could probably be dealt with in the same way.