[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b9527f12-f3ad-c6b9-2967-5d708d69d937@amd.com>
Date: Mon, 19 Jul 2021 09:19:33 -0500
From: Brijesh Singh <brijesh.singh@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: brijesh.singh@....com, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
Tom Lendacky <thomas.lendacky@....com>,
"H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dov Murik <dovmurik@...ux.ibm.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Borislav Petkov <bp@...en8.de>,
Michael Roth <michael.roth@....com>,
Vlastimil Babka <vbabka@...e.cz>, tony.luck@...el.com,
npmccallum@...hat.com, brijesh.ksingh@...il.com
Subject: Re: [PATCH Part2 RFC v4 33/40] KVM: SVM: Add support to handle MSR
based Page State Change VMGEXIT
On 7/16/21 4:00 PM, Sean Christopherson wrote:
> On Wed, Jul 07, 2021, Brijesh Singh wrote:
>> +static int __snp_handle_psc(struct kvm_vcpu *vcpu, int op, gpa_t gpa, int level)
>
> I can live with e.g. GHCB_MSR_PSC_REQ, but I'd strongly prefer to spell this out,
> e.g. __snp_handle_page_state_change() or whatever. I had a hell of a time figuring
> out what PSC was the first time I saw it in some random context.
Based on the previous review feedback I renamed from
__snp_handle_page_state_change to __snp_handle_psc(). I will see what
others say and based on that will rename accordingly.
>
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + int rc, tdp_level;
>> + kvm_pfn_t pfn;
>> + gpa_t gpa_end;
>> +
>> + gpa_end = gpa + page_level_size(level);
>> +
>> + while (gpa < gpa_end) {
>> + /*
>> + * Get the pfn and level for the gpa from the nested page table.
>> + *
>> + * If the TDP walk failed, then its safe to say that we don't have a valid
>> + * mapping for the gpa in the nested page table. Create a fault to map the
>> + * page is nested page table.
>> + */
>> + if (!kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn, &tdp_level)) {
>> + pfn = kvm_mmu_map_tdp_page(vcpu, gpa, PFERR_USER_MASK, level);
>> + if (is_error_noslot_pfn(pfn))
>> + goto out;
>> +
>> + if (!kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn, &tdp_level))
>> + goto out;
>> + }
>> +
>> + /* Adjust the level so that we don't go higher than the backing page level */
>> + level = min_t(size_t, level, tdp_level);
>> +
>> + write_lock(&kvm->mmu_lock);
>
> Retrieving the PFN and level outside of mmu_lock is not correct. Because the
> pages are pinned and the VMM is not malicious, it will function as intended, but
> it is far from correct.
>
Good point, I should have retrieved the pfn and level inside the lock.
> The overall approach also feels wrong, e.g. a guest won't be able to convert a
> 2mb chunk back to a 2mb large page if KVM mapped the GPA as a 4kb page in the
> past (from a different conversion).
>
Maybe I am missing something, I am not able to follow 'guest won't be
able to convert a 2mb chunk back to a 2mb large page'. The page-size
used inside the guest have to relationship with the RMP/NPT page-size.
e.g, a guest can validate the page range as a 4k and still map the page
range as a 2mb or 1gb in its pagetable.
> I'd also strongly prefer to have a common flow between SNP and TDX for converting
> between shared/prviate.
>
> I'll circle back to this next week, it'll probably take a few hours of staring
> to figure out a solution, if a common one for SNP+TDX is even possible.
>
Sounds good.
>> +
>> + switch (op) {
>> + case SNP_PAGE_STATE_SHARED:
>> + rc = snp_make_page_shared(vcpu, gpa, pfn, level);
>> + break;
>> + case SNP_PAGE_STATE_PRIVATE:
>> + rc = snp_make_page_private(vcpu, gpa, pfn, level);
>> + break;
>> + default:
>> + rc = -EINVAL;
>> + break;
>> + }
>> +
>> + write_unlock(&kvm->mmu_lock);
>> +
>> + if (rc) {
>> + pr_err_ratelimited("Error op %d gpa %llx pfn %llx level %d rc %d\n",
>> + op, gpa, pfn, level, rc);
>> + goto out;
>> + }
>> +
>> + gpa = gpa + page_level_size(level);
>> + }
>> +
>> +out:
>> + return rc;
>> +}
>> +
>> static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
>> {
>> struct vmcb_control_area *control = &svm->vmcb->control;
>> @@ -2941,6 +3063,25 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
>> GHCB_MSR_INFO_POS);
>> break;
>> }
>> + case GHCB_MSR_PSC_REQ: {
>> + gfn_t gfn;
>> + int ret;
>> + u8 op;
>> +
>> + gfn = get_ghcb_msr_bits(svm, GHCB_MSR_PSC_GFN_MASK, GHCB_MSR_PSC_GFN_POS);
>> + op = get_ghcb_msr_bits(svm, GHCB_MSR_PSC_OP_MASK, GHCB_MSR_PSC_OP_POS);
>> +
>> + ret = __snp_handle_psc(vcpu, op, gfn_to_gpa(gfn), PG_LEVEL_4K);
>> +
>> + /* If failed to change the state then spec requires to return all F's */
>
> That doesn't mesh with what I could find:
>
> o 0x015 – SNP Page State Change Response
> ▪ GHCBData[63:32] – Error code
> ▪ GHCBData[31:12] – Reserved, must be zero
> Written by the hypervisor in response to a Page State Change request. Any non-
> zero value for the error code indicates that the page state change was not
> successful.
>
> And if "all Fs" is indeed the error code, 'int ret' probably only works by luck
> since the return value is a 64-bit value, where as ret is a 32-bit signed int.
>
>> + if (ret)
>> + ret = -1;
>
> Uh, this is fubar. You've created a shadow of 'ret', i.e. the outer ret is likely
> uninitialized.
>
Ah, let me fix it in next rev.
thanks
Powered by blists - more mailing lists