lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ze-hC8NozVbOQQIT@google.com>
Date: Mon, 11 Mar 2024 17:25:47 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Lai Jiangshan <jiangshanlai@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Josh Triplett <josh@...htriplett.org>, kvm@...r.kernel.org, 
	rcu@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Kevin Tian <kevin.tian@...el.com>, Yiwei Zhang <zzyiwei@...gle.com>
Subject: Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop

On Mon, Mar 11, 2024, Yan Zhao wrote:
> On Fri, Mar 08, 2024 at 05:09:29PM -0800, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 17a8e4fdf9c4..5dc4c24ae203 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7605,11 +7605,13 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> >  
> >  	/*
> >  	 * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
> > -	 * device attached.  Letting the guest control memory types on Intel
> > -	 * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
> > -	 * the guest to behave only as a last resort.
> > +	 * device attached and the CPU doesn't support self-snoop.  Letting the
> > +	 * guest control memory types on Intel CPUs without self-snoop may
> > +	 * result in unexpected behavior, and so KVM's (historical) ABI is to
> > +	 * trust the guest to behave only as a last resort.
> >  	 */
> > -	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
> > +	if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
> > +	    !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> >  		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
> 
> For the case of !static_cpu_has(X86_FEATURE_SELFSNOOP) &&
> kvm_arch_has_noncoherent_dma(vcpu->kvm), I think we at least should warn
> about unsafe before honoring guest memory type.

I don't think it gains us enough to offset the potential pain such a message
would bring.  Assuming the warning isn't outright ignored, the most likely scenario
is that the warning will cause random end users to worry that the setup they've
been running for years is broken, when in reality it's probably just fine for their
use case.

I would be quite surprised if there are people running untrusted workloads on
10+ year old silicon *and* have passthrough devices and non-coherent IOMMUs/DMA.
And anyone exposing a device directly to an untrusted workload really should have
done their homework.

And it's not like we're going to change KVM's historical behavior at this point.

> Though it's a KVM's historical ABI, it's not safe in the security perspective
> because page aliasing without proper cache flush handling on CPUs without
> self-snoop may open a door for guest to read uninitialized host data.
> e.g. when there's a noncoherent DMA device attached, and if there's a memory
> region that is not pinned in vfio/iommufd side, (e.g. memory region in vfio's
> skipped section), then though the guest memory from this memory region is not
> accessible to noncoherent DMAs, vCPUs can still access this part of guest memory.
> Then if vCPUs use WC as guest type, it may bypass host's initialization data in
> cache and read stale data in host, causing information leak.
> 
> My preference is still to force WB
> (i.e. (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT) in case of
> !static_cpu_has(X86_FEATURE_SELFSNOOP) && kvm_arch_has_noncoherent_dma(vcpu->kvm).
> Firstly, it's because there're few CPUs with features VMX without self-snoop;

This is unfortunately not true.  I don't know the details, but apparently all
Intel CPUs before Ivy Bridge had a one or more related errata.

/*
 * Processors which have self-snooping capability can handle conflicting
 * memory type across CPUs by snooping its own cache. However, there exists
 * CPU models in which having conflicting memory types still leads to
 * unpredictable behavior, machine check errors, or hangs. Clear this
 * feature to prevent its use on machines with known erratas.
 */
static void check_memory_type_self_snoop_errata(struct cpuinfo_x86 *c)
{
	switch (c->x86_model) {
	case INTEL_FAM6_CORE_YONAH:
	case INTEL_FAM6_CORE2_MEROM:
	case INTEL_FAM6_CORE2_MEROM_L:
	case INTEL_FAM6_CORE2_PENRYN:
	case INTEL_FAM6_CORE2_DUNNINGTON:
	case INTEL_FAM6_NEHALEM:
	case INTEL_FAM6_NEHALEM_G:
	case INTEL_FAM6_NEHALEM_EP:
	case INTEL_FAM6_NEHALEM_EX:
	case INTEL_FAM6_WESTMERE:
	case INTEL_FAM6_WESTMERE_EP:
	case INTEL_FAM6_SANDYBRIDGE:
		setup_clear_cpu_cap(X86_FEATURE_SELFSNOOP);
	}
}

> Secondly, security takes priority over functionality :)

Yeah, but not breaking userspace for setups that have existed for 10+ years takes
priority over all of that :-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ