linux-kernel - Re: [RFC PATCH 0/1] KVM-arm: Optimize cache flush by only flushing on vcpu0

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <87sem5y5s4.wl-maz@kernel.org>
Date: Fri, 18 Apr 2025 14:10:19 +0100
From: Marc Zyngier <maz@...nel.org>
To: Jiayuan Liang <ljykernel@....com>
Cc: Oliver Upton <oliver.upton@...ux.dev>,
	Joey Gouly <joey.gouly@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Zenghui Yu <yuzenghui@...wei.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>,
	linux-arm-kernel@...ts.infradead.org,
	kvmarm@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/1] KVM-arm: Optimize cache flush by only flushing on vcpu0

On Fri, 18 Apr 2025 11:22:43 +0100,
Jiayuan Liang <ljykernel@....com> wrote:
> 
> This is an RFC patch to optimize cache flushing behavior in KVM/arm64.
> 
> When toggling cache state in a multi-vCPU guest, we currently flush the VM's
> stage2 page tables on every vCPU that transitions cache state. This leads to
> redundant cache flushes during guest boot, as each vCPU performs the same
> flush operation.
> 
> In a typical guest boot sequence, vcpu0 is the first to enable caches, and
> other vCPUs follow afterward. By the time secondary vCPUs enable their caches,
> the flush performed by vcpu0 has already ensured cache coherency for the
> entire VM.

The most immediate issue I can spot is that vcpu0 is not special.
There is nothing that says vcpu0 will be the first switching its MMU
on, nor that vcpu0 will ever be running. I guess what you would want
instead is that the *first* vcpu that enables its MMU performs the
CMOs, while the others may not have to.

But even then, this changes a behaviour some guests *may* be relying
on, which is that what they have written while their MMU was off is
visible with the MMU on, without the guest doing any CMO of its own.

A lot of this stuff comes from the days where we were mostly running
32bit guests, some of which had (and still have) pretty bad
assumptions (set/way operations being one of them).

64bit guests *should* be much better behaved, and I wonder whether we
could actually drop the whole thing altogether for those. Something
like the hack below.

But this requires testing and more thought than I'm prepared to on a
day off... ;-)

Thanks,

	M.

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bd020fc28aa9c..9d05e65433916 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -85,9 +85,11 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 	 * For non-FWB CPUs, we trap VM ops (HCR_EL2.TVM) until M+C
 	 * get set in SCTLR_EL1 such that we can detect when the guest
 	 * MMU gets turned on and do the necessary cache maintenance
-	 * then.
+	 * then. Limit this dance to 32bit guests, assuming that 64bit
+	 * guests are reasonably behaved.
 	 */
-	if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+	if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) &&
+	    vcpu_el1_is_32bit(vcpu))
 		vcpu->arch.hcr_el2 |= HCR_TVM;
 }

-- 
Jazz isn't dead. It just smells funny.