[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a10d63840b02c4bbe1c027e5f230e4799f87ddda.camel@intel.com>
Date: Mon, 27 Mar 2023 09:54:40 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "isaku.yamahata@...il.com" <isaku.yamahata@...il.com>
CC: "Christopherson,, Sean" <seanjc@...gle.com>,
"Shahar, Sagi" <sagis@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Aktas, Erdem" <erdemaktas@...gle.com>,
"dmatlack@...gle.com" <dmatlack@...gle.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"zhi.wang.linux@...il.com" <zhi.wang.linux@...il.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>
Subject: Re: [PATCH v13 057/113] KVM: TDX: MTRR: implement get_mt_mask() for
TDX
On Fri, 2023-03-24 at 18:12 -0700, Isaku Yamahata wrote:
> On Thu, Mar 16, 2023 at 10:38:02AM +0000,
> "Huang, Kai" <kai.huang@...el.com> wrote:
>
> > On Sun, 2023-03-12 at 10:56 -0700, isaku.yamahata@...el.com wrote:
> > > From: Isaku Yamahata <isaku.yamahata@...el.com>
> > >
> > > Because TDX virtualize cpuid[0x1].EDX[MTRR: bit 12] to fixed 1, guest TD
> > > thinks MTRR is supported. Although TDX supports only WB for private GPA,
> > > it's desirable to support MTRR for shared GPA. As guest access to MTRR
> > > MSRs causes #VE and KVM/x86 tracks the values of MTRR MSRs, the remining
> > > part is to implement get_mt_mask method for TDX for shared GPA.
> > >
> > > Pass around shared bit from kvm fault handler to get_mt_mask method so that
> > > it can determine if the gfn is shared or private.
> > >
> >
> > I think we have an Xarray to query whether a given GFN is shared or private?
> > Can we use that?
> >
> > > Implement get_mt_mask()
> > > following vmx case for shared GPA and return WB for private GPA.
> > > the existing vmx_get_mt_mask() can't be directly used as CPU state(CR0.CD)
> > > is protected. GFN passed to kvm_mtrr_check_gfn_range_consistency() should
> > > include shared bit.
> > >
> > > Suggested-by: Kai Huang <kai.huang@...el.com>
> >
> > I am not sure what is suggested by me?
> >
> > I thought what I suggested is we should have a dedicated patch to handle MTRR
> > for TDX putting all related things together.
>
> Sure. After looking at the specs, my conclusion is that guest TD can't update
> MTRR reliably. Because MTRR update requires to disable cache(CR0.CR=1), cache
> flush, and tlb flush. (SDM 3a 12.11.7: MTRR maintenance programming interface)
> Linux implements MTRR update logic according to it.
>
> While TDX enforces CR0.CD=0, trying to set CR0.CD=1 results in #GP. At the
> same time, it reports that MTRR is available via cpuid. So I come up with the
> followings.
>
> - MTRRCap(RO): report no feature available
> SMRR=0: SMRR interface unsupported
> WC=0: write combining unsupported
> FIX=0: Fixed range registers unsupported
> VCNT=0: number of variable range regitsers = 0
>
> - MTRRDefType(R/W): Only writeback even with reset state.
> E=1: enable MTRR (E=0 means all memory is UC.)
> FE=0: disable fixed range MTRRs
> type: default memory type=writeback
> Accept write this value. Other value results in #GP.
>
> - Fixed range MTRR
> #GP as fixed range MTRRs is reported as unsupported
>
> - Variable range MTRRs
> #GP as the number of variable range MTRRs is reported as zero
Looks sane to me.
>
>
> > > +u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> > > +{
> > > + /* TDX private GPA is always WB. */
> > > + if (!(gfn & kvm_gfn_shared_mask(vcpu->kvm))) {
> > > + /* MMIO is only for shared GPA. */
> > > + WARN_ON_ONCE(is_mmio);
> > > + return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
> > > + }
> > > +
> > > + /* Drop shared bit as MTRR doesn't know about shared bit. */
> > > + gfn = kvm_gfn_to_private(vcpu->kvm, gfn);
> > > +
> > > + /* As TDX enforces CR0.CD to 0, pass check_cr0_cd = false. */
> > > + return __vmx_get_mt_mask(vcpu, gfn, is_mmio, false);
> > > +}
> >
> >
> > Do you know whether there's any use case of non-coherent device assignment to
> > TDX guest?
> >
> > IMHO, we should just disallow TDX guest to support non-coherent device
> > assignment, so that we can just return WB for both private and shared.
> >
> > If we support non-coherent device assignment, then if guest sets private memory
> > to non-WB memory, it believes the memory type is non-WB, but in fact TDX always
> > map private memory as WB.
> >
> > Will this be a problem, i.e. if assigned device can DMA to private memory
> > directly in the future?
>
> MTRR is legacy feature and PAT replaced it. We can rely on guest to use PAT.
> Here is the new patch for MTRR.
>
> --- a/arch/x86/kvm/vmx/main.c
> +++ b/arch/x86/kvm/vmx/main.c
> @@ -229,6 +229,14 @@ static void vt_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
> vmx_load_mmu_pgd(vcpu, root_hpa, pgd_level);
> }
>
> +static u8 vt_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> +{
> + if (is_td_vcpu(vcpu))
> + return tdx_get_mt_mask(vcpu, gfn, is_mmio);
> +
> + return vmx_get_mt_mask(vcpu, gfn, is_mmio);
> +}
> +
> static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
> {
> if (!is_td(kvm))
> @@ -349,7 +357,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
>
> .set_tss_addr = vmx_set_tss_addr,
> .set_identity_map_addr = vmx_set_identity_map_addr,
> - .get_mt_mask = vmx_get_mt_mask,
> + .get_mt_mask = vt_get_mt_mask,
>
> .get_exit_info = vmx_get_exit_info,
>
> diff -u b/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> --- b/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -347,6 +347,25 @@
> return 0;
> }
>
> +u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> +{
> + /* TDX private GPA is always WB. */
> + if (!(gfn & kvm_gfn_shared_mask(vcpu->kvm))) {
Are you still passing a "raw" GFN? Could you explain why you choose this way?
> + /* MMIO is only for shared GPA. */
> + WARN_ON_ONCE(is_mmio);
> + return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
I guess it's better to include VMX_EPT_IPAT_BIT bit.
> + }
> +
> + if (is_mmio)
> + return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
> +
> + if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
> + return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
> +
> + /* TDX enforces CR0.CD = 0 and KVM MTRR emulation enforces writeback. */
> + return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
> +}
> +
Powered by blists - more mailing lists