lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b4067e4e-6901-265e-49f8-0142b52f5819@amd.com>
Date:   Wed, 13 Oct 2021 12:49:03 -0500
From:   Brijesh Singh <brijesh.singh@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     brijesh.singh@....com, x86@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        Tom Lendacky <thomas.lendacky@....com>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Michael Roth <michael.roth@....com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Andi Kleen <ak@...ux.intel.com>, tony.luck@...el.com,
        marcorr@...gle.com, sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: Re: [PATCH Part2 v5 37/45] KVM: SVM: Add support to handle MSR based
 Page State Change VMGEXIT


On 10/13/21 10:24 AM, Sean Christopherson wrote:
> On Wed, Oct 13, 2021, Brijesh Singh wrote:
>>> The more I look at this, the more strongly I feel that private <=> shared conversions
>>> belong in the MMU, and that KVM's SPTEs should be the single source of truth for
>>> shared vs. private.  E.g. add a SPTE_TDP_PRIVATE_MASK in the software available bits.
>>> I believe the only hiccup is the snafu where not zapping _all_ SPTEs on memslot
>>> deletion breaks QEMU+VFIO+GPU, i.e. KVM would lose its canonical info on unrelated
>>> memslot deletion.
>>>
>>> But that is a solvable problem.  Ideally the bug, wherever it is, would be root
>>> caused and fixed.  I believe Peter (and Marc?) is going to work on reproducing
>>> the bug.
>> We have been also setting up VM with Qemu + VFIO + GPU usecase to repro
>> the bug on AMD HW and so far we no luck in reproducing it. Will continue
>> stressing the system to recreate it. Lets hope that Peter (and Marc) can
>> easily recreate on Intel HW so that we can work towards fixing it.
> Are you trying on a modern kernel?  If so, double check that nx_huge_pages is off,
> turning that on caused the bug to disappear.  It should be off for AMD systems,
> but it's worth checking.

Yes, this is a recent kernel. I will double check the nx_huge_pages is off.


>>>> +		if (!rc) {
>>>> +			/*
>>>> +			 * This may happen if another vCPU unmapped the page
>>>> +			 * before we acquire the lock. Retry the PSC.
>>>> +			 */
>>>> +			write_unlock(&kvm->mmu_lock);
>>>> +			return 0;
>>> How will the caller (guest?) know to retry the PSC if KVM returns "success"?
>> If a guest is adhering to the GHCB spec then it will see that hypervisor
>> has not processed all the entry and it should retry the PSC.
> But AFAICT that information isn't passed to the guest.  Even in this single-page
> MSR-based case, the caller will say "all good" on a return of 0.
>
> The "full" path is more obvious, as the caller clearly continues to process
> entries unless there's an actual failure.
>
> +       for (; cur <= end; cur++) {
> +               entry = &info->entries[cur];
> +               gpa = gfn_to_gpa(entry->gfn);
> +               level = RMP_TO_X86_PG_LEVEL(entry->pagesize);
> +               op = entry->operation;
> +
> +               if (!IS_ALIGNED(gpa, page_level_size(level))) {
> +                       rc = PSC_INVALID_ENTRY;
> +                       goto out;
> +               }
> +
> +               rc = __snp_handle_page_state_change(vcpu, op, gpa, level);
> +               if (rc)
> +                       goto out;
> +       }
>
Please see the guest kernel patch #19 [1]. In spec there is no special
code for the retry. The guest will look at the PSC hdr to determine how
many entries were processed by the hypervisor (in this particular case a
0). And at time the guest can do whatever it wants. In the case of Linux
guest, we retry the PSC.

[1]
https://lore.kernel.org/linux-mm/20211008180453.462291-20-brijesh.singh@amd.com/

thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ