linux-kernel - Re: [PATCH v3 2/5] arm64: initialise SCTLR2

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aKYDSOdW3kGlpkxv@e129823.arm.com>
Date: Wed, 20 Aug 2025 18:18:00 +0100
From: Yeoreum Yun <yeoreum.yun@....com>
To: Dave Martin <Dave.Martin@....com>
Cc: catalin.marinas@....com, will@...nel.org, broonie@...nel.org,
	oliver.upton@...ux.dev, anshuman.khandual@....com, robh@...nel.org,
	james.morse@....com, mark.rutland@....com, joey.gouly@....com,
	ahmed.genidi@....com, kevin.brodsky@....com,
	scott@...amperecomputing.com, mbenes@...e.cz,
	james.clark@...aro.org, frederic@...nel.org, rafael@...nel.org,
	pavel@...nel.org, ryan.roberts@....com, suzuki.poulose@....com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-pm@...r.kernel.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v3 2/5] arm64: initialise SCTLR2_ELx register at boot time

Hi Dave,

>
> On Wed, Aug 13, 2025 at 01:01:15PM +0100, Yeoreum Yun wrote:
> > SCTLR2_ELx register is optional starting from ARMv8.8/ARMv9.3,
> > and becomes mandatory from ARMv8.9/ARMv9.4
> > and serveral architectural feature are controled by bits in
> > these registers.
> >
> > These register's value is UNKNOWN when it was reset
> > It wasn't need to be initialised if firmware initilises these registers
> > properly.
> > But for the case not initialised properly, initialise SCTLR2_ELx
> > registers at bootin cpu/vcpu so that
> > unexpected system behavior couldn't happen for
> > improper SCTLR2_ELx value.
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@....com>
> > ---
>
> Please note significant changes under the tearoff line or in the cover
> letter.
>
> (Merging the __kvm_host_psci_cpu_entry() changes into this patch is a
> significant change.)
>
> >  arch/arm64/include/asm/assembler.h   | 22 ++++++++++++++++++++++
> >  arch/arm64/include/asm/el2_setup.h   |  6 ++++++
> >  arch/arm64/include/asm/sysreg.h      |  5 +++++
> >  arch/arm64/kernel/head.S             |  5 +++++
> >  arch/arm64/kernel/hyp-stub.S         | 10 ++++++++++
> >  arch/arm64/kvm/hyp/nvhe/psci-relay.c |  3 +++
> >  6 files changed, 51 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> > index 23be85d93348..eef169c105f0 100644
> > --- a/arch/arm64/include/asm/assembler.h
> > +++ b/arch/arm64/include/asm/assembler.h
> > @@ -738,6 +738,28 @@ alternative_endif
> >  	set_sctlr sctlr_el2, \reg
> >  .endm
> >
> > +	/*
> > +	 * Set SCTLR2_ELx to the @reg value.
> > +	 */
>
> One-line comment preferred:
>
> 	/* Set SCTLR2_ELx to the @reg value. */
>
> (when the comment is short enough to fit).
>
>
> Nit: can you follow the same indentation pattern as the "set_sctlr"
> macros that appear above?
>
> (I know, cond_yield etc. do things differently.  But the sctlr
> accessors feel like they should belong together and look similar (as
> far as possible).)

Thanks :) I'll change it.

>
>
> > +	.macro __set_sctlr2_elx, el, reg, tmp
> > +	mrs_s	\tmp, SYS_ID_AA64MMFR3_EL1
> > +	ubfx	\tmp, \tmp, #ID_AA64MMFR3_EL1_SCTLRX_SHIFT, #4
> > +	cbz	\tmp, .Lskip_sctlr2_\@
> > +	.if	\el == 2
> > +	msr_s	SYS_SCTLR_EL2, \reg
>
> Er, should this be SYS_SCTLR2_EL2 ?!

Oops. I was lucky when I test it. Thanks for catching this!

>
> To make the code more readable at the call site, would it make sense
> to have:
>
> 	msr_s	SYS_SCTLR2_\el, \reg
>
> ...?
>
> This would mean that you can get rid of the lengthy .if...else, and the
> assembler will report an error if the el argument is junk or missing.
> (The argument would need to have an explicit EL prefix at the call
> site.)
>
> (At the moment, the code silently uses SCTLR(2)_EL1 when the el
> argument is junk, which is not ideal.  Where it is easy to detect
> stupid errors by the caller at build time, we should try to do it.)


That was what I tried However, msr_s is also macro.
Unfortunately If I apply your suggestion, SYS_SCTLR2_EL2 doesn't expand.
IOW this macro expand like:
  emit(SYS_SCTLR2_EL2) // -> not the encoded hex of SYS_SCTLR2_EL2

So, I've used this way..

>
> > +	.elseif	\el == 12
> > +	msr_s	SYS_SCTLR_EL12, \reg
> > +	.else
> > +	msr_s	SYS_SCTLR_EL1, \reg
> > +	.endif
> > +.Lskip_sctlr2_\@:
> > +	.endm
> > +
> > +	.macro set_sctlr2_elx, el, reg, tmp
> > +	__set_sctlr2_elx	\el, \reg, \tmp
> > +	isb
> > +	.endm
> > +
>
> Maybe we don't need two macros here?
>
> __set_sctlr2_elx seems only to be used in one place.
>
> The isb is needed when setting SCTLR2_ELx for the current EL, but
> the code at the call site knows whether this is the case.  It is
> hard for the macro to know, and it feels overcomplicated to
> explicitly check CurrentEL here.
>
> I think it may be better to open-code the isb in the places where it
> matters, including inside the __init_sctlr2_el2 macro.  (Compare with
> the other __init_foo macros here.)

I see. Thanks for clarification!

>
> >  	/*
> >  	 * Check whether asm code should yield as soon as it is able. This is
> >  	 * the case if we are currently running in task context, and the
> > diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> > index d755b4d46d77..c03cabd45fcf 100644
> > --- a/arch/arm64/include/asm/el2_setup.h
> > +++ b/arch/arm64/include/asm/el2_setup.h
> > @@ -48,6 +48,11 @@
> >  	isb
> >  .endm
> >
> > +.macro __init_sctlr2_el2
> > +	mov_q	x0, INIT_SCTLR2_EL2
> > +	set_sctlr2_elx	2, x0, x1
> > +.endm
> > +
> >  .macro __init_el2_hcrx
> >  	mrs	x0, id_aa64mmfr1_el1
> >  	ubfx	x0, x0, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
> > @@ -411,6 +416,7 @@
> >   */
> >  .macro init_el2_state
> >  	__init_el2_sctlr
> > +	__init_sctlr2_el2
> >  	__init_el2_hcrx
> >  	__init_el2_timers
> >  	__init_el2_debug
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index d5b5f2ae1afa..0431b357b87b 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -868,6 +868,8 @@
> >  #define INIT_SCTLR_EL2_MMU_OFF \
> >  	(SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
> >
> > +#define INIT_SCTLR2_EL2			UL(0)
> > +
> >  /* SCTLR_EL1 specific flags. */
> >  #ifdef CONFIG_CPU_BIG_ENDIAN
> >  #define ENDIAN_SET_EL1		(SCTLR_EL1_E0E | SCTLR_ELx_EE)
> > @@ -888,6 +890,8 @@
> >  	 SCTLR_EL1_LSMAOE | SCTLR_EL1_nTLSMD | SCTLR_EL1_EIS   | \
> >  	 SCTLR_EL1_TSCXT  | SCTLR_EL1_EOS)
> >
> > +#define INIT_SCTLR2_EL1			UL(0)
> > +
> >  /* MAIR_ELx memory attributes (used by Linux) */
> >  #define MAIR_ATTR_DEVICE_nGnRnE		UL(0x00)
> >  #define MAIR_ATTR_DEVICE_nGnRE		UL(0x04)
> > @@ -1164,6 +1168,7 @@
> >  	msr	hcr_el2, \reg
> >  #endif
> >  	.endm
> > +
> >  #else
> >
> >  #include <linux/bitfield.h>
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index ca04b338cb0d..c41015675eae 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -276,6 +276,8 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
> >  	mov_q	x0, INIT_SCTLR_EL1_MMU_OFF
> >  	pre_disable_mmu_workaround
> >  	msr	sctlr_el1, x0
> > +	mov_q	x0, INIT_SCTLR2_EL1
> > +	__set_sctlr2_elx	1, x0, x1
> >  	isb
> >  	mov_q	x0, INIT_PSTATE_EL1
> >  	msr	spsr_el1, x0
> > @@ -308,6 +310,7 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> >  	isb
> >
> >  	mov_q	x1, INIT_SCTLR_EL1_MMU_OFF
> > +	mov_q	x2, INIT_SCTLR2_EL1
> >
> >  	mrs	x0, hcr_el2
> >  	and	x0, x0, #HCR_E2H
> > @@ -315,11 +318,13 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> >
> >  	/* Set a sane SCTLR_EL1, the VHE way */
> >  	msr_s	SYS_SCTLR_EL12, x1
> > +	__set_sctlr2_elx	12, x2, x0
> >  	mov	x2, #BOOT_CPU_FLAG_E2H
> >  	b	3f
> >
> >  2:
> >  	msr	sctlr_el1, x1
> > +	__set_sctlr2_elx	1, x2, x0
> >  	mov	x2, xzr
> >  3:
> >  	mov	x0, #INIT_PSTATE_EL1
> > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> > index 36e2d26b54f5..ac12f1b4f8e2 100644
> > --- a/arch/arm64/kernel/hyp-stub.S
> > +++ b/arch/arm64/kernel/hyp-stub.S
> > @@ -144,7 +144,17 @@ SYM_CODE_START_LOCAL(__finalise_el2)
> >
> >  .Lskip_indirection:
> >  .Lskip_tcr2:
> > +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> > +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_SCTLRX_SHIFT, #4
> > +	cbz	x1, .Lskip_sctlr2
> > +	mrs_s	x1, SYS_SCTLR2_EL12
> > +	msr_s	SYS_SCTLR2_EL1, x1
>
> Is this code only applicable to the VHE case?  (I think this is
> probably true, but it's not completely obvious just from the context in
> this patch...)
>
> Anyway, the pattern in this function is to transfer the "same EL"
> controls over from the kernel into EL2, which is what the block above
> does.
>
>
> The next bit of code looks strange, though.
>
> If we have committed to running the kernel at EL2, why does the code
> reinitialise SCTLR2_EL1 here:

When init_el2 returns regardless of VHE or not, it runs at EL1 by

  mov  x0, #INIT_PSTATE_EL1
  msr  spsr_el2, x0

And as SCTLR_EL1 is initialise by consequece code, and this syncs at
finalise_el2. and This will be the same for the SCTLR2_EL1
(although there's no modification of SCTLR2_EL1, but in the feature
that would be).

So, In case of VHE case, the SCTLR2_EL1 which was set by the code before
finalise_el2 will set this value. and it writes to SCTLR2_EL2.
since SCTLR2_EL1 is initilised for SCTLR2_EL2, this should be cleared.

>
> > +	// clean SCTLR2_EL1
> > +	mov_q	x1, INIT_SCTLR2_EL1
> > +	msr_s	SYS_SCTLR2_EL12, x1
> > +
> > +.Lskip_sctlr2:
> >  	isb
> >
> >  	// Hack the exception return to stay at EL2
> > diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > index c3e196fb8b18..4ed4b7fa57c2 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > @@ -4,6 +4,7 @@
> >   * Author: David Brazdil <dbrazdil@...gle.com>
> >   */
> >
> > +#include <asm/alternative.h>
> >  #include <asm/kvm_asm.h>
> >  #include <asm/kvm_hyp.h>
> >  #include <asm/kvm_mmu.h>
> > @@ -219,6 +220,8 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
> >  		release_boot_args(boot_args);
> >
> >  	write_sysreg_el1(INIT_SCTLR_EL1_MMU_OFF, SYS_SCTLR);
> > +	if (alternative_has_cap_unlikely(ARM64_HAS_SCTLR2))
> > +		write_sysreg_el1(INIT_SCTLR2_EL1, SYS_SCTLR2);
>
> Maybe drop the "unlikely" in this case?
>
> Compared with context switch, this is a slow path -- so keeping the
> code as simple as possible is desirable so long as the overall effect
> on performance is not significant.
>
> (It works either way, though.)

Okay. I'll use as likely.

Thanks!
>
> Cheers
> ---Dave

--
Sincerely,
Yeoreum Yun