lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXEBjumu3VUySg1cQ+SYm4MugJ5f6pd1f7C0XrLyOWAoOw@mail.gmail.com>
Date: Thu, 1 Feb 2024 09:08:22 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Fangrui Song <maskray@...gle.com>
Cc: Dave Martin <Dave.Martin@....com>, Catalin Marinas <catalin.marinas@....com>, 
	Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org, 
	Jisheng Zhang <jszhang@...nel.org>, llvm@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: jump_label: use constraint "S" instead of "i"

On Thu, 1 Feb 2024 at 05:55, Fangrui Song <maskray@...gle.com> wrote:
>
> On 2024-01-31, Dave Martin wrote:
> >On Wed, Jan 31, 2024 at 08:16:04AM +0100, Ard Biesheuvel wrote:
> >> Hello Fangrui,
> >>
> >> On Wed, 31 Jan 2024 at 07:53, Fangrui Song <maskray@...gle.com> wrote:
> >> >
> >> > The constraint "i" seems to be copied from x86 (and with a redundant
> >> > modifier "c"). It works with -fno-PIE but not with -fPIE/-fPIC in GCC's
> >> > aarch64 port.
> >
> >(I'm not sure of the exact history, but the "c" may be inherited from
> >arm, where an output modifier was needed to suppress the "#" that
> >prefixes immediates in the traditional asm syntax.  This does not
> >actually seem to be required for AArch64: rather while a # is allowed
> >and still considered good style in handwritten asm code, the syntax
> >doesn't require it, and the compiler doesn't emit it for "i" arguments,
> >AFAICT.)
>
> The aarch64 one could be inherited from
> arch/arm/include/asm/jump_label.h (2012), which could in turn be
> inherited from x86 (2010).
> Both the constraint "i" and the modifier "c" are generic..
> For -fno-pic this combination can be used for every arch.
>
> >> > The constraint "S", which denotes a symbol reference (e.g. function,
> >> > global variable) or label reference, is more appropriate, and has been
> >> > available in GCC since 2012 and in Clang since 7.0.
> >> >
> >> > Signed-off-by: Fangrui Song <maskray@...gle.com>
> >> > Link: https://maskray.me/blog/2024-01-30-raw-symbol-names-in-inline-assembly
> >> > ---
> >> >  arch/arm64/include/asm/jump_label.h | 8 ++++----
> >> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
> >> > index 48ddc0f45d22..31862b3bb33d 100644
> >> > --- a/arch/arm64/include/asm/jump_label.h
> >> > +++ b/arch/arm64/include/asm/jump_label.h
> >> > @@ -23,9 +23,9 @@ static __always_inline bool arch_static_branch(struct static_key * const key,
> >> >                  "      .pushsection    __jump_table, \"aw\"    \n\t"
> >> >                  "      .align          3                       \n\t"
> >> >                  "      .long           1b - ., %l[l_yes] - .   \n\t"
> >> > -                "      .quad           %c0 - .                 \n\t"
> >> > +                "      .quad           %0 - .                  \n\t"
> >> >                  "      .popsection                             \n\t"
> >> > -                :  :  "i"(&((char *)key)[branch]) :  : l_yes);
> >> > +                :  :  "S"(&((char *)key)[branch]) :  : l_yes);
> >>
> >> 'key' is not used as a raw symbol name. We should make this
> >>
> >> "    .quad   %0 + %1 - ."
> >>
> >> and
> >>
> >> ::  "S"(key), "i"(branch) :: l_yes);
> >>
> >> if we want to really clean this up.
> >
> >This hides more logic in the asm so it's arguably more cryptic
> >(although the code is fairly cryptic to begin with -- I don't really
> >see why the argument wasn't written as the equivalent
> >(char *)key + branch...)
>
> I agree that using "S" and "i" would introduce complexity.
> Using just "S" as this patch does should be clear.
>
> All of "i" "s" "S" support a symbol or label reference and a constant offset (can be zero),
> (in object file, a symbol and an addend; in GCC's term, the sum of a SYMBOL_REF and a CONST_INT).
>

Taken the address of a struct, cast it to char[] and then index it
using a boolean is rather disgusting, no?

> >Anyway, I don't think the "i" versys "S" distinction makes any
> >difference without -fpic or equivalent, so it is not really relevant
> >for the kernel (except that "S" breaks compatibility with older
> >compilers...)
> >
> >
> >I think the main advantage of "S" is that it stops you accidentally
> >emitting undesirable relocations from asm code that is not written for
> >the -fpic case.
> >
> >But just changing "i" to "S" is not sufficient to port asms to -fpic:
> >the asms still need to be reviewed.
> >
> >
> >So unless the asm has been reviewed for position-independence, it may
> >anyway be better to stick with "i" so that the compiler actually chokes
> >if someone tries to build the code with -fpic.
>

The virtual address of the kernel is randomized by KASLR, which relies
on PIE linking, and this puts constraints on the permitted types of
relocations.

IOW, we basically already build the kernel as PIC code, but without
relying on -fPIC, because that triggers some behaviors that only make
sense for shared objects in user space.

> >Since we are not trying to run arbitraily many running kernels in a
> >common address space (and not likely to do that), I'm not sure that we
> >would ever build the kernel with -fpic except for a few special-case
> >bits like the EFI stub and vDSO... unless I've missed something?
> >

Yes, KASLR. The number of kernels is not the point, the point is that
the virtual load address of the kernel is usually decided at boot, and
so the code needs to be generated to accommodate that.

> >If there's another reason why "S" is advantageous though, I'm happy to
> >be corrected.
>
> I remember that Ard has an RFC
> https://lore.kernel.org/linux-arm-kernel/20220427171241.2426592-1-ardb@kernel.org/
> "[RFC PATCH 0/2] arm64: use PIE code generation for KASLR kernel"
> and see some recent PIE codegen patches.
>
> > Building the KASLR kernel without -fpie but linking it with -pie works
> > in practice, but it is not something that is explicitly supported by the
> > toolchains - it happens to work because the default 'small' code model
> > used by both GCC and Clang relies mostly on ADRP+ADD/LDR to generate
> > symbol references.
>
> I agree that current -fno-PIE with -shared -Bsymbolic linking is a hack
> that works as a conincidence, not guaranteed by the toolchain.
> This jump_label improvement (with no object file difference) fixes an
> obstacle.

If we can get the guaranteed behavior of #pragma GCC visibility
push(hidden) from a command line option, we should build the core
kernel with -fpie instead. (Modules are partially linked objects, so
they can be built non-PIC as before)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ