Message-ID: <dbqqojpvqodfxavt4fxugoj3a2ppk5b4b3sp77qsmbg33sc2em@fhjccbxaihrh>
Date: Wed, 28 Aug 2024 14:31:42 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Charlie Jenkins <charlie@...osinc.com>
Cc: Arnd Bergmann <arnd@...db.de>, Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Naveen N Rao <naveen@...nel.org>, Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Huacai Chen <chenhuacai@...nel.org>,
WANG Xuerui <kernel@...0n.name>, Russell King <linux@...linux.org.uk>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
Helge Deller <deller@....de>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
"David S. Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>, Shuah Khan <shuah@...nel.org>,
Alexandre Ghiti <alexghiti@...osinc.com>, linux-arch@...r.kernel.org,
linux-kernel@...r.kernel.org, Palmer Dabbelt <palmer@...osinc.com>,
linux-riscv@...ts.infradead.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-mm@...ck.org,
loongarch@...ts.linux.dev, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org, linux-s390@...r.kernel.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH 00/16] mm: Introduce MAP_BELOW_HINT
* Charlie Jenkins <charlie@...osinc.com> [240828 01:49]:
> Some applications rely on placing data in the free bits of addresses allocated
> by mmap. Various architectures (e.g. x86, arm64, powerpc) restrict the
> address returned by mmap to be less than the maximum address space,
> unless the hint address is greater than this value.
Wait, which arch(s) allow addresses greater than the max? The passed hint
should be where we start searching, but we go to the lower limit then
start at the hint and search up (or vice versa, depending on the direction).

I don't understand how unmapping works on a higher address; we would
fail to free it on termination of the application.

Also, there are archs that map outside of the VMAs, which are freed by
freeing from the prev->vm_end to next->vm_start, so I don't understand
what that looks like in this scenario either.
>
> On arm64 this barrier is at 52 bits and on x86 it is at 56 bits. This
> flag allows applications a way to specify exactly how many bits they
> want to be left unused by mmap. This eliminates the need for
> applications to know the page table hierarchy of the system to be able
> to reason which addresses mmap will be allowed to return.
But why do they need to know today? We have a limit for this, don't we?

Also, these upper limits are how some archs use the upper bits that you
are trying to use.
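
For anyone following along, here is a rough sketch of the kind of userspace
pattern the cover letter describes: storing a small tag in the unused high
bits of a pointer. It is not taken from the series. The names tag_ptr(),
untag_ptr(), ptr_tag() and the 48-bit ADDR_BITS value are hypothetical
choices for illustration only; the real number of free bits depends on the
architecture and on what mmap returns.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: every pointer fits in the low 48 bits. */
#define ADDR_BITS 48u
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Pack a 16-bit tag into the bits above ADDR_BITS (hypothetical helper). */
static uint64_t tag_ptr(const void *p, uint16_t tag)
{
	uint64_t addr = (uint64_t)(uintptr_t)p;

	/* If mmap handed back a higher address, the scheme cannot work. */
	assert((addr & ~ADDR_MASK) == 0);
	return addr | ((uint64_t)tag << ADDR_BITS);
}

/* Recover the original pointer by masking off the tag bits. */
static void *untag_ptr(uint64_t v)
{
	return (void *)(uintptr_t)(v & ADDR_MASK);
}

/* Recover the tag stored in the high bits. */
static uint16_t ptr_tag(uint64_t v)
{
	return (uint16_t)(v >> ADDR_BITS);
}
```

The assert in tag_ptr() is the whole point of the discussion: the scheme
only holds if every address the application receives stays below the
chosen bit boundary.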
>
> ---
> riscv made this feature of mmap returning addresses less than the hint
> address the default behavior. This was in contrast to the implementation
> of x86/arm64 that have a single boundary at the 5-level page table
> region. However this restriction proved too great -- the reduced
> address space when using a hint address was too small.
Yes, the hint is used to group things close together, so it would
literally be random chance whether you have enough room or not (ASLR and
all).
>
> A patch for riscv [1] reverts the behavior that broke userspace. This
> series serves to make this feature available to all architectures.
I don't fully understand this statement. You say it broke userspace, so
now you are porting it to everyone? This reads as if you are breaking
userspace on all architectures :)

If you fail to find room below, then your application fails as there is
no way to get the upper bits you need. It would be better to fix this
in userspace - if your application is returned too high an address, then
free it and exit because it's going to fail anyways.
>
> I have only tested on riscv and x86.
This should be an RFC then.
> There is a tremendous amount of
> duplicated code in mmap so the implementations across architectures I
> believe should be mostly consistent. I added this feature to all
> architectures that implement either
> arch_get_mmap_end()/arch_get_mmap_base() or
> arch_get_unmapped_area_topdown()/arch_get_unmapped_area(). I also added
> it to the default behavior for arch_get_mmap_end()/arch_get_mmap_base().
Way too much duplicate code. We should be figuring out how to make this
all work with the same code.

This is going to make the cloned code problem worse.
>
> Link: https://lore.kernel.org/lkml/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/T/ [1]
>
> To: Arnd Bergmann <arnd@...db.de>
> To: Paul Walmsley <paul.walmsley@...ive.com>
> To: Palmer Dabbelt <palmer@...belt.com>
> To: Albert Ou <aou@...s.berkeley.edu>
> To: Catalin Marinas <catalin.marinas@....com>
> To: Will Deacon <will@...nel.org>
> To: Michael Ellerman <mpe@...erman.id.au>
> To: Nicholas Piggin <npiggin@...il.com>
> To: Christophe Leroy <christophe.leroy@...roup.eu>
> To: Naveen N Rao <naveen@...nel.org>
> To: Muchun Song <muchun.song@...ux.dev>
> To: Andrew Morton <akpm@...ux-foundation.org>
> To: Liam R. Howlett <Liam.Howlett@...cle.com>
> To: Vlastimil Babka <vbabka@...e.cz>
> To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> To: Thomas Gleixner <tglx@...utronix.de>
> To: Ingo Molnar <mingo@...hat.com>
> To: Borislav Petkov <bp@...en8.de>
> To: Dave Hansen <dave.hansen@...ux.intel.com>
> To: x86@...nel.org
> To: H. Peter Anvin <hpa@...or.com>
> To: Huacai Chen <chenhuacai@...nel.org>
> To: WANG Xuerui <kernel@...0n.name>
> To: Russell King <linux@...linux.org.uk>
> To: Thomas Bogendoerfer <tsbogend@...ha.franken.de>
> To: James E.J. Bottomley <James.Bottomley@...senPartnership.com>
> To: Helge Deller <deller@....de>
> To: Alexander Gordeev <agordeev@...ux.ibm.com>
> To: Gerald Schaefer <gerald.schaefer@...ux.ibm.com>
> To: Heiko Carstens <hca@...ux.ibm.com>
> To: Vasily Gorbik <gor@...ux.ibm.com>
> To: Christian Borntraeger <borntraeger@...ux.ibm.com>
> To: Sven Schnelle <svens@...ux.ibm.com>
> To: Yoshinori Sato <ysato@...rs.sourceforge.jp>
> To: Rich Felker <dalias@...c.org>
> To: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
> To: David S. Miller <davem@...emloft.net>
> To: Andreas Larsson <andreas@...sler.com>
> To: Shuah Khan <shuah@...nel.org>
> To: Alexandre Ghiti <alexghiti@...osinc.com>
> Cc: linux-arch@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> Cc: Palmer Dabbelt <palmer@...osinc.com>
> Cc: linux-riscv@...ts.infradead.org
> Cc: linux-arm-kernel@...ts.infradead.org
> Cc: linuxppc-dev@...ts.ozlabs.org
> Cc: linux-mm@...ck.org
> Cc: loongarch@...ts.linux.dev
> Cc: linux-mips@...r.kernel.org
> Cc: linux-parisc@...r.kernel.org
> Cc: linux-s390@...r.kernel.org
> Cc: linux-sh@...r.kernel.org
> Cc: sparclinux@...r.kernel.org
> Cc: linux-kselftest@...r.kernel.org
> Signed-off-by: Charlie Jenkins <charlie@...osinc.com>
>
> ---
> Charlie Jenkins (16):
> mm: Add MAP_BELOW_HINT
> riscv: mm: Do not restrict mmap address based on hint
> mm: Add flag and len param to arch_get_mmap_base()
> mm: Add generic MAP_BELOW_HINT
> riscv: mm: Support MAP_BELOW_HINT
> arm64: mm: Support MAP_BELOW_HINT
> powerpc: mm: Support MAP_BELOW_HINT
> x86: mm: Support MAP_BELOW_HINT
> loongarch: mm: Support MAP_BELOW_HINT
> arm: mm: Support MAP_BELOW_HINT
> mips: mm: Support MAP_BELOW_HINT
> parisc: mm: Support MAP_BELOW_HINT
> s390: mm: Support MAP_BELOW_HINT
> sh: mm: Support MAP_BELOW_HINT
> sparc: mm: Support MAP_BELOW_HINT
> selftests/mm: Create MAP_BELOW_HINT test
>
> arch/arm/mm/mmap.c | 10 ++++++++
> arch/arm64/include/asm/processor.h | 34 ++++++++++++++++++++++----
> arch/loongarch/mm/mmap.c | 11 +++++++++
> arch/mips/mm/mmap.c | 9 +++++++
> arch/parisc/include/uapi/asm/mman.h | 1 +
> arch/parisc/kernel/sys_parisc.c | 9 +++++++
> arch/powerpc/include/asm/task_size_64.h | 36 +++++++++++++++++++++++-----
> arch/riscv/include/asm/processor.h | 32 -------------------------
> arch/s390/mm/mmap.c | 10 ++++++++
> arch/sh/mm/mmap.c | 10 ++++++++
> arch/sparc/kernel/sys_sparc_64.c | 8 +++++++
> arch/x86/kernel/sys_x86_64.c | 25 ++++++++++++++++---
> fs/hugetlbfs/inode.c | 2 +-
> include/linux/sched/mm.h | 34 ++++++++++++++++++++++++--
> include/uapi/asm-generic/mman-common.h | 1 +
> mm/mmap.c | 2 +-
> tools/arch/parisc/include/uapi/asm/mman.h | 1 +
> tools/include/uapi/asm-generic/mman-common.h | 1 +
> tools/testing/selftests/mm/Makefile | 1 +
> tools/testing/selftests/mm/map_below_hint.c | 29 ++++++++++++++++++++++
> 20 files changed, 216 insertions(+), 50 deletions(-)
> ---
> base-commit: 5be63fc19fcaa4c236b307420483578a56986a37
> change-id: 20240827-patches-below_hint_mmap-b13d79ae1c55
> --
> - Charlie
>