Message-ID: <d873a994-4efa-4d3a-bdae-5d9a3eff29f2@lucifer.local>
Date: Fri, 13 Sep 2024 08:41:34 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Charlie Jenkins <charlie@...osinc.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Arnd Bergmann <arnd@...db.de>, guoren <guoren@...nel.org>,
Richard Henderson <richard.henderson@...aro.org>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>, Vineet Gupta <vgupta@...nel.org>,
Russell King <linux@...linux.org.uk>,
Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
"James E . J . Bottomley" <James.Bottomley@...senpartnership.com>,
Helge Deller <deller@....de>, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Naveen N Rao <naveen@...nel.org>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
"David S . Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>, shuah <shuah@...nel.org>,
Christoph Hellwig <hch@...radead.org>, Michal Hocko <mhocko@...e.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Chris Torek <chris.torek@...il.com>,
Linux-Arch <linux-arch@...r.kernel.org>, linux-kernel@...r.kernel.org,
linux-alpha@...r.kernel.org, linux-snps-arc@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org,
"linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>,
loongarch@...ts.linux.dev, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-s390@...r.kernel.org, linux-sh@...r.kernel.org,
sparclinux@...r.kernel.org, linux-mm@...ck.org,
linux-kselftest@...r.kernel.org, linux-abi-devel@...ts.sourceforge.net
Subject: Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to
47 bits
On Wed, Sep 11, 2024 at 11:18:12PM GMT, Charlie Jenkins wrote:
> On Wed, Sep 11, 2024 at 07:21:27PM +0100, Catalin Marinas wrote:
> > On Tue, Sep 10, 2024 at 05:45:07PM -0700, Charlie Jenkins wrote:
> > > On Tue, Sep 10, 2024 at 03:08:14PM -0400, Liam R. Howlett wrote:
> > > > * Catalin Marinas <catalin.marinas@....com> [240906 07:44]:
> > > > > On Fri, Sep 06, 2024 at 09:55:42AM +0000, Arnd Bergmann wrote:
> > > > > > On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> > > > > > > On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann <arnd@...db.de> wrote:
> > > > > > >> It's also unclear to me how we want this flag to interact with
> > > > > > >> the existing logic in arch_get_mmap_end(), which attempts to
> > > > > > >> limit the default mapping to a 47-bit address space already.
> > > > > > >
> > > > > > > To optimize RISC-V progress, I recommend:
> > > > > > >
> > > > > > > Step 1: Approve the patch.
> > > > > > > Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> > > > > > > Step 3: Wait approximately several iterations for Go & OpenJDK
> > > > > > > Step 4: Remove the 47-bit constraint in arch_get_mmap_end()
> >
> > Point 4 is an ABI change. What guarantees that there isn't still
> > software out there that relies on the old behaviour?
>
> Yeah I don't think it would be desirable to remove the 47 bit
> constraint in architectures that already have it.
>
> >
> > > > > > I really want to first see a plausible explanation about why
> > > > > > RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
> > > > > > like all the other major architectures (x86, arm64, powerpc64),
> > > > >
> > > > > FWIW arm64 actually limits DEFAULT_MAP_WINDOW to 48-bit in the default
> > > > > configuration. We end up with a 47-bit with 16K pages but for a
> > > > > different reason that has to do with LPA2 support (I doubt we need this
> > > > > for the user mapping but we need to untangle some of the macros there;
> > > > > that's for a separate discussion).
> > > > >
> > > > > That said, we haven't encountered any user space problems with a 48-bit
> > > > > DEFAULT_MAP_WINDOW. So I also think RISC-V should follow a similar
> > > > > approach (47 or 48 bit default limit). Better to have some ABI
> > > > > consistency between architectures. One can still ask for addresses above
> > > > > this default limit via mmap().
> > > >
> > > > I think that is best as well.
> > > >
> > > > Can we please just do what x86 and arm64 does?
> > >
> > > I responded to Arnd in the other thread, but I am still not convinced
> > > that the solution that x86 and arm64 have selected is the best solution.
> > > The solution of defaulting to 47 bits does allow applications the
> > > ability to get addresses that are below 47 bits. However, due to
> > > differences across architectures it doesn't seem possible to have all
> > > architectures default to the same value. Additionally, this flag will be
> > > able to help users avoid potential bugs where a hint address is passed
> > > that causes upper bits of a VA to be used.
> >
> > The reason we added this limit on arm64 is that we noticed programs
> > using the top 8 bits of a 64-bit pointer for additional information.
> > IIRC, it wasn't even openJDK but some JavaScript JIT. We could have
> > taught those programs of a new flag but since we couldn't tell how many
> > are out there, it was the safest to default to a smaller limit and opt
> > in to the higher one. Such opt-in is via mmap() but if you prefer a
> > prctl() flag, that's fine by me as well (though I think this should be
> > opt-in to higher addresses rather than opt-out of the higher addresses).
>
> The mmap() flag was used in previous versions but was decided against
> because this feature is more useful if it is process-wide. A
> personality() flag was chosen instead of a prctl() flag because there
> existed other flags in personality() that were similar. I am tempted to
> use prctl() however because then we could have an additional arg to
> select the exact number of bits that should be reserved (rather than
> being fixed at 47 bits).

I am very much not in favour of a prctl(): it would require us to add state
limiting the address space, and the timing of when it is set becomes
critical. Then we have the same issue as with the other proposals: what
happens if the limit is too low?

What is 'too low' varies by architecture, and for 32-bit architectures
could get quite... problematic.

And again, what is the RoI here? We would be introducing maintenance burden
and edge cases vs. the x86 solution in order to... accommodate things that
need more than 128 TiB of address space? A problem that does not appear to
exist in reality?

I suggested the personality approach as the least impactful compromise way
of making this series work, but after what Arnd has said (and please
forgive me if I've missed further discussion, I have been dipping in and
out of this!), adapting RISC-V to the approach we take elsewhere seems the
most sensible solution to me.

This remains something we can revisit in future if this turns out to be
egregious.

>
> Opting-in to the higher address space is reasonable. However, it is not
> my preference, because the purpose of this flag is to ensure that
> allocations do not exceed 47-bits, so it is a clearer ABI to have the
> applications that want this guarantee to be the ones setting the flag,
> rather than the applications that want the higher bits setting the flag.
Perfect is the enemy of the good :) and an idealised solution may not end
up being something everybody can agree on.
>
> - Charlie
>
> >
> > --
> > Catalin
>
>
>