Message-ID: <87sem5y5s4.wl-maz@kernel.org>
Date: Fri, 18 Apr 2025 14:10:19 +0100
From: Marc Zyngier <maz@...nel.org>
To: Jiayuan Liang <ljykernel@....com>
Cc: Oliver Upton <oliver.upton@...ux.dev>,
Joey Gouly <joey.gouly@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Zenghui Yu <yuzenghui@...wei.com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/1] KVM-arm: Optimize cache flush by only flushing on vcpu0
On Fri, 18 Apr 2025 11:22:43 +0100,
Jiayuan Liang <ljykernel@....com> wrote:
>
> This is an RFC patch to optimize cache flushing behavior in KVM/arm64.
>
> When toggling cache state in a multi-vCPU guest, we currently flush the VM's
> stage2 page tables on every vCPU that transitions cache state. This leads to
> redundant cache flushes during guest boot, as each vCPU performs the same
> flush operation.
>
> In a typical guest boot sequence, vcpu0 is the first to enable caches, and
> other vCPUs follow afterward. By the time secondary vCPUs enable their caches,
> the flush performed by vcpu0 has already ensured cache coherency for the
> entire VM.
The most immediate issue I can spot is that vcpu0 is not special.
There is nothing that says vcpu0 will be the first switching its MMU
on, nor that vcpu0 will ever be running. I guess what you would want
instead is that the *first* vcpu that enables its MMU performs the
CMOs, while the others may not have to.
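
A minimal sketch of that "first enabler does the work" idea, using a
hypothetical per-VM flag (the struct and names below are made up for
illustration, not actual KVM code):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-VM state; a real implementation would hang this
 * off struct kvm, not a standalone struct. */
struct fake_kvm {
	atomic_flag cmo_done;
};

/*
 * Returns true only for the first caller, i.e. the first vCPU that
 * turns its MMU on, which would then perform the cache maintenance
 * on behalf of the whole VM. Later callers get false and can skip it.
 */
static bool first_mmu_enable(struct fake_kvm *kvm)
{
	return !atomic_flag_test_and_set(&kvm->cmo_done);
}
```

The test-and-set makes the "first" determination race-free even if
several vCPUs flip SCTLR_EL1.M concurrently, which is the property a
vcpu0-based check cannot give you.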

But even then, this changes a behaviour some guests *may* be relying
on, which is that what they have written while their MMU was off is
visible with the MMU on, without the guest doing any CMO of its own.
A lot of this stuff comes from the days where we were mostly running
32bit guests, some of which had (and still have) pretty bad
assumptions (set/way operations being one of them).

64bit guests *should* be much better behaved, and I wonder whether we
could actually drop the whole thing altogether for those. Something
like the hack below.

But this requires testing and more thought than I'm prepared to put
in on a day off... ;-)

Thanks,

	M.

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bd020fc28aa9c..9d05e65433916 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -85,9 +85,11 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
* For non-FWB CPUs, we trap VM ops (HCR_EL2.TVM) until M+C
* get set in SCTLR_EL1 such that we can detect when the guest
* MMU gets turned on and do the necessary cache maintenance
- * then.
+ * then. Limit this dance to 32bit guests, assuming that 64bit
+ * guests are reasonably behaved.
*/
- if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+ if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) &&
+ vcpu_el1_is_32bit(vcpu))
vcpu->arch.hcr_el2 |= HCR_TVM;
}
--
Jazz isn't dead. It just smells funny.