Message-ID: <CAMj1kXEOmvUx8f=_v7_AFhMLobtauSw20t76sEDmzays4NLQnw@mail.gmail.com>
Date: Tue, 26 Aug 2025 15:18:48 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Arnd Bergmann <arnd@...db.de>
Cc: Nathan Chancellor <nathan@...nel.org>, linux-kernel@...r.kernel.org,
Kees Cook <kees@...nel.org>, Nick Desaulniers <nick.desaulniers+lkml@...il.com>,
Bill Wendling <morbo@...gle.com>, Justin Stitt <justinstitt@...gle.com>, llvm@...ts.linux.dev,
patches@...ts.linux.dev, Russell King <linux@...linux.org.uk>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
On Fri, 22 Aug 2025 at 22:04, Arnd Bergmann <arnd@...db.de> wrote:
>
> On Fri, Aug 22, 2025, at 09:05, Arnd Bergmann wrote:
> > On Thu, Aug 21, 2025, at 23:15, Nathan Chancellor wrote:
> >
> > Would it be possible to either change the macro or to move
> > the overflow_stack_ptr closer in order to completely eliminate
> > the CONFIG_ARM_HAS_GROUP_RELOCS symbol and have VMAP_STACK
> > enabled for all CONFIG_MMU builds?
> >
> > Are there any other build testing issues with ARM_HAS_GROUP_RELOCS
> > besides the one I saw here?
>
> With some more randconfig testing, I did come across a few
> configurations that each fail with hundreds of errors like
>
> arm-linux-gnueabi-ld: drivers/crypto/hifn_795x.o(.text+0x99c): overflow whilst splitting 0x10a61854 for group relocation R_ARM_LDR_PC_G2
>
> so I guess we'll have to stick with the current dependency,
> at least for ARMv6 and below.
>
This is due to LOAD_SYM_ARMV6() (rather than the ldr_this_cpu_armv6
asm macro), which is used to implement get_current() on configs that
use a global variable to store the current task pointer (i.e., non-K
ARMv6 and older). It eliminates the first of two LDRs. That first LDR
would otherwise pollute the D-cache, as every occurrence of
get_current() would emit a literal into .text carrying the address of
the __current global variable. The D-cache footprint of each such
literal is a cacheline, which never contains other useful data.

(The second LDR is needed, and it always refers to the same address, so
it does not impact D-cache efficiency.)

The LOAD_SYM_ARMV6() sequence has a range of 256 MiB, which is
sufficient for any ARM kernel that can be meaningfully used in
production. However, randconfigs may produce kernels that are larger
than this, so we need the COMPILE_TEST check if we are going to
keep the optimization, and I think it is meaningful enough to do so.
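
To make the 256 MiB figure concrete, here is a rough Python sketch of how
a PC-relative offset gets split across a G0/G1 ALU pair plus a final LDR
with a 12-bit immediate. It assumes the sequence uses two ALU group
relocations ahead of the R_ARM_LDR_PC_G2 named in the error, and it only
models my understanding of the linker's splitting rule; the function
names are made up for illustration and this is not the linker's code.

```python
def split_group_reloc(offset):
    """Split a non-negative offset into (g0, g1, residual).

    Assumed model: each ALU group takes the top 8 bits of what is left,
    aligned to an even bit position (ARM rotated-immediate encoding).
    Whatever remains must fit the 12-bit LDR immediate.
    """
    residual = offset
    groups = []
    for _ in range(2):  # G0, then G1
        if residual == 0:
            shift = 0
        else:
            # Highest even-aligned bit pair that has a bit set.
            msb = max(m for m in range(0, 32, 2) if residual & (3 << m))
            shift = max(msb - 6, 0)
        g = residual & (0xFF << shift)
        groups.append(g)
        residual &= ~g
    return groups[0], groups[1], residual


def fits_ldr_pc_g2(offset):
    """True if the leftover after G0 and G1 fits a 12-bit LDR offset."""
    return split_group_reloc(offset)[2] < 0x1000


# Any offset below 2**28 (256 MiB) fits: 8 + 8 + 12 bits.
assert fits_ldr_pc_g2(0x0FFFFFFF)

# The offset from the hifn_795x.o error leaves 0x1854, which is too big.
assert split_group_reloc(0x10A61854) == (0x10800000, 0x00260000, 0x1854)
assert not fits_ldr_pc_g2(0x10A61854)
```

Under this model, offsets just past 256 MiB can still happen to fit when
their low bits are sparse, which would explain why only some randconfigs
trip over it.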