Message-ID: <aLgbvWYeCr5l1MF6@e133380.arm.com>
Date: Wed, 3 Sep 2025 11:43:09 +0100
From: Dave Martin <Dave.Martin@....com>
To: Yeoreum Yun <yeoreum.yun@....com>
Cc: catalin.marinas@....com, will@...nel.org, broonie@...nel.org,
oliver.upton@...ux.dev, anshuman.khandual@....com, robh@...nel.org,
james.morse@....com, mark.rutland@....com, joey.gouly@....com,
ahmed.genidi@....com, kevin.brodsky@....com,
scott@...amperecomputing.com, mbenes@...e.cz,
james.clark@...aro.org, frederic@...nel.org, rafael@...nel.org,
pavel@...nel.org, ryan.roberts@....com, suzuki.poulose@....com,
maz@...nel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
kvmarm@...ts.linux.dev
Subject: Re: [PATCH v4 2/5] arm64: initialise SCTLR2_ELx register at boot time

Hi,

On Tue, Sep 02, 2025 at 12:05:50PM +0100, Yeoreum Yun wrote:
> Hi Dave,
>
> [...]
>
> > > > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> > > > > index 36e2d26b54f5..ac12f1b4f8e2 100644
> > > > > --- a/arch/arm64/kernel/hyp-stub.S
> > > > > +++ b/arch/arm64/kernel/hyp-stub.S
> > > > > @@ -144,7 +144,17 @@ SYM_CODE_START_LOCAL(__finalise_el2)
> > > > >
> > > > > .Lskip_indirection:
> > > > > .Lskip_tcr2:
> > > > > + mrs_s x1, SYS_ID_AA64MMFR3_EL1
> > > > > + ubfx x1, x1, #ID_AA64MMFR3_EL1_SCTLRX_SHIFT, #4
> > > > > + cbz x1, .Lskip_sctlr2
> > > > > + mrs_s x1, SYS_SCTLR2_EL12
> > > > > + msr_s SYS_SCTLR2_EL1, x1
> > > > >
> > > > > + // clean SCTLR2_EL1
> > > > > + mov_q x1, INIT_SCTLR2_EL1
> > > > > + msr_s SYS_SCTLR2_EL12, x1
> > > >
> > > > I'm still not sure why we need to do this. The code doesn't seem to
> > > > clean up by the EL1 value of any other register -- or have I missed
> > > > something?
> > > >
> > > > We have already switched to EL2, via the HVC call that jumped to
> > > > __finalise_el2. We won't run at EL1 again unless KVM starts a guest --
> > > > but in that case, it's KVM's responsibility to set up the EL1 registers
> > > > before handing control to the guest.
> > > >
> > > > In any case, is SCTLR2_EL1 ever set to anything except INIT_SCTLR2_EL1
> > > > before we get here?
[...]
> When I look at init_el2(), it returns to EL1 via:
>
> mov x0, #INIT_PSTATE_EL1
> msr spsr_el2, x0
> ...
> eret
>
> In other words, from init_kernel_el() through finalise_el2(),
> all system-register accesses are made at EL1 (i.e., SYS_REG_EL1).
> During this period, it appears that only SCTLR_EL1 is modified,
> so the code only needs to care about the accessed register — SCTLR_EL1.
>
> That’s why SCTLR_EL1 is reinitialised at the end of finalise_el2().
> Otherwise, the MMU bit might remain enabled, which could cause errors later
> when launching a VM under VHE.
>
> However, the idea behind this patch is to initialise SCTLR2_ELx
> the same way as SCTLR_ELx.
> I’m not sure whether SCTLR2_ELx is modified during this period.
> If it is (now or in the future),
> it should be cleared/reinitialised just like SCTLR_EL1.
>
> This patch is based on the assumption that there may be modifications to
> SCTLR2_ELx during this period. So it isn’t about other system registers;
> it’s about the register actually used during this period.
>
> Am I missing anything?
>
> Thanks!
>
> --
> Sincerely,
> Yeoreum Yun

I think I missed the SCTLR_EL1 reset in the idmap code after the
enter_vhe label.

Actually, I'm not sure whether there is any architectural reason for
setting SCTLR_EL1 to INIT_SCTLR_EL1_MMU_OFF here. "for good measure"
suggests that it felt like a good idea but there was no known reason
for it. The commit message for the original patch doesn't offer an
explanation -- maybe Marc can remember.

This might be a defence against speculative translation table walks
using the EL1&0 regime. However, the architecture says [RNRJPP]: "If an
implementation is executing at EL3 or EL2, the PE is not permitted to
use the registers associated with the EL1&0 translation regime to
speculatively access memory or translation tables." So it shouldn't
really matter, but in case buggy CPUs don't implement this rule
properly, it may be a good idea to turn the stage 1 MMU off anyway.

Since it's there, though, it probably does make sense to reinitialise
SCTLR2_EL1 at the same time -- but can you move this so that it is next
to the SCTLR_EL1 reinitialisation? Otherwise, the purpose of
reinitialising SCTLR2_EL1 is unclear. It really should come under the
same "for good measure" justification as the SCTLR_EL1 reset.
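
Roughly, and untested, the code after the enter_vhe label could then
look something like the sketch below. The first two instructions are
assumed to match the existing SCTLR_EL1 reset in hyp-stub.S, the
feature check is copied from the hunk quoted above, the use of x0 as
scratch is an assumption, and the .Lskip_sctlr2_reset label is made up
for illustration:

	// Disable the EL1 S1 MMU for good measure (assumed existing code)
	mov_q	x0, INIT_SCTLR_EL1_MMU_OFF
	msr_s	SYS_SCTLR_EL12, x0

	// Likewise reset SCTLR2_EL1 for good measure, if it is implemented
	mrs_s	x0, SYS_ID_AA64MMFR3_EL1
	ubfx	x0, x0, #ID_AA64MMFR3_EL1_SCTLRX_SHIFT, #4
	cbz	x0, .Lskip_sctlr2_reset		// hypothetical label
	mov_q	x0, INIT_SCTLR2_EL1
	msr_s	SYS_SCTLR2_EL12, x0
.Lskip_sctlr2_reset: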

However, I don't think this has anything to do with putting things into
a clean state for VMs. KVM defines the reset state for all the _EL1
regs explicitly -- failing to do that would be a bug in KVM.
(See arch/arm64/kvm/sys_regs.c : sys_reg_descs[], kvm_reset_sys_regs().)

Cheers
---Dave