lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240201045551.ajg4iqcajyowl2rh@google.com>
Date: Wed, 31 Jan 2024 20:55:51 -0800
From: Fangrui Song <maskray@...gle.com>
To: Dave Martin <Dave.Martin@....com>
Cc: Ard Biesheuvel <ardb@...nel.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org,
	Jisheng Zhang <jszhang@...nel.org>, llvm@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: jump_label: use constraint "S" instead of "i"

On 2024-01-31, Dave Martin wrote:
>On Wed, Jan 31, 2024 at 08:16:04AM +0100, Ard Biesheuvel wrote:
>> Hello Fangrui,
>>
>> On Wed, 31 Jan 2024 at 07:53, Fangrui Song <maskray@...gle.com> wrote:
>> >
>> > The constraint "i" seems to be copied from x86 (and with a redundant
>> > modifier "c"). It works with -fno-PIE but not with -fPIE/-fPIC in GCC's
>> > aarch64 port.
>
>(I'm not sure of the exact history, but the "c" may be inherited from
>arm, where an output modifier was needed to suppress the "#" that
>prefixes immediates in the traditional asm syntax.  This does not
>actually seem to be required for AArch64: rather while a # is allowed
>and still considered good style in handwritten asm code, the syntax
>doesn't require it, and the compiler doesn't emit it for "i" arguments,
>AFAICT.)

The aarch64 one could be inherited from
arch/arm/include/asm/jump_label.h (2012), which could in turn be
inherited from x86 (2010).
Both the constraint "i" and the modifier "c" are generic..
For -fno-pic this combination can be used for every arch.

>> > The constraint "S", which denotes a symbol reference (e.g. function,
>> > global variable) or label reference, is more appropriate, and has been
>> > available in GCC since 2012 and in Clang since 7.0.
>> >
>> > Signed-off-by: Fangrui Song <maskray@...gle.com>
>> > Link: https://maskray.me/blog/2024-01-30-raw-symbol-names-in-inline-assembly
>> > ---
>> >  arch/arm64/include/asm/jump_label.h | 8 ++++----
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
>> > index 48ddc0f45d22..31862b3bb33d 100644
>> > --- a/arch/arm64/include/asm/jump_label.h
>> > +++ b/arch/arm64/include/asm/jump_label.h
>> > @@ -23,9 +23,9 @@ static __always_inline bool arch_static_branch(struct static_key * const key,
>> >                  "      .pushsection    __jump_table, \"aw\"    \n\t"
>> >                  "      .align          3                       \n\t"
>> >                  "      .long           1b - ., %l[l_yes] - .   \n\t"
>> > -                "      .quad           %c0 - .                 \n\t"
>> > +                "      .quad           %0 - .                  \n\t"
>> >                  "      .popsection                             \n\t"
>> > -                :  :  "i"(&((char *)key)[branch]) :  : l_yes);
>> > +                :  :  "S"(&((char *)key)[branch]) :  : l_yes);
>>
>> 'key' is not used as a raw symbol name. We should make this
>>
>> "    .quad   %0 + %1 - ."
>>
>> and
>>
>> ::  "S"(key), "i"(branch) :: l_yes);
>>
>> if we want to really clean this up.
>
>This hides more logic in the asm so it's arguably more cryptic
>(although the code is fairly cryptic to begin with -- I don't really
>see why the argument wasn't written as the equivalent
>(char *)key + branch...)

I agree that using "S" and "i" would introduce complexity.
Using just "S" as this patch does should be clear.

All of "i" "s" "S" support a symbol or label reference and a constant offset (can be zero),
(in object file, a symbol and an addend; in GCC's term, the sum of a SYMBOL_REF and a CONST_INT).

>Anyway, I don't think the "i" versys "S" distinction makes any
>difference without -fpic or equivalent, so it is not really relevant
>for the kernel (except that "S" breaks compatibility with older
>compilers...)
>
>
>I think the main advantage of "S" is that it stops you accidentally
>emitting undesirable relocations from asm code that is not written for
>the -fpic case.
>
>But just changing "i" to "S" is not sufficient to port asms to -fpic:
>the asms still need to be reviewed.
>
>
>So unless the asm has been reviewed for position-independence, it may
>anyway be better to stick with "i" so that the compiler actually chokes
>if someone tries to build the code with -fpic.

The asm is position-independent.
This `.long sym - .` is a common metadata section pattern to support PIC:)

Regarding the constraints, I've updated
https://maskray.me/blog/2024-01-30-raw-symbol-names-in-inline-assembly
to include more details.

>Since we are not trying to run arbitraily many running kernels in a
>common address space (and not likely to do that), I'm not sure that we
>would ever build the kernel with -fpic except for a few special-case
>bits like the EFI stub and vDSO... unless I've missed something?
>
>If there's another reason why "S" is advantageous though, I'm happy to
>be corrected.

I remember that Ard has an RFC
https://lore.kernel.org/linux-arm-kernel/20220427171241.2426592-1-ardb@kernel.org/
"[RFC PATCH 0/2] arm64: use PIE code generation for KASLR kernel"
and see some recent PIE codegen patches.

> Building the KASLR kernel without -fpie but linking it with -pie works
> in practice, but it is not something that is explicitly supported by the
> toolchains - it happens to work because the default 'small' code model
> used by both GCC and Clang relies mostly on ADRP+ADD/LDR to generate
> symbol references.

I agree that current -fno-PIE with -shared -Bsymbolic linking is a hack
that works as a conincidence, not guaranteed by the toolchain.
This jump_label improvement (with no object file difference) fixes an
obstacle. 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ