Date:   Mon, 31 Jul 2017 11:54:42 -0500
From:   Brijesh Singh <brijesh.singh@....com>
To:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org
Cc:     brijesh.singh@....com, thomas.lendacky@....com, rkrcmar@...hat.com,
        joro@...tes.org, x86@...nel.org, linux-kernel@...r.kernel.org,
        mingo@...hat.com, hpa@...or.com, tglx@...utronix.de, bp@...e.de
Subject: Re: [PATCH v2 1/3] kvm: svm: Add support for additional SVM NPF error
 codes


On 07/31/2017 10:44 AM, Paolo Bonzini wrote:
> On 31/07/2017 15:30, Brijesh Singh wrote:
>> Hi Paolo,
>>
>> On 07/27/2017 11:27 AM, Paolo Bonzini wrote:
>>> On 23/11/2016 18:01, Brijesh Singh wrote:
>>>> +    /*
>>>> +     * Before emulating the instruction, check if the error code
>>>> +     * was due to a RO violation while translating the guest page.
>>>> +     * This can occur when using nested virtualization with nested
>>>> +     * paging in both guests. If true, we simply unprotect the page
>>>> +     * and resume the guest.
>>>> +     *
>>>> +     * Note: AMD only (since it supports the PFERR_GUEST_PAGE_MASK used
>>>> +     *       in PFERR_NEXT_GUEST_PAGE)
>>>> +     */
>>>> +    if (error_code == PFERR_NESTED_GUEST_PAGE) {
>>>> +        kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2));
>>>> +        return 1;
>>>> +    }
>>>
>>>
>>> What happens if L1 is mapping some memory that is read only in L0?  That
>>> is, the L1 nested page tables make it read-write, but the L0 shadow
>>> nested page tables make it read-only.
>>>
>>> Accessing it would cause an NPF, and then my guess is that the L1 guest
>>> would loop on the failing instruction instead of just dropping the write.
>>>
>>
>>
>> Not sure if I am able to follow your use case. Could you please explain it
>> in a bit more detail?
>>
>> The purpose of the code above was really for when we resume from the L2 guest
>> back to the L1 guest. The L1 page tables are marked RO when in the L2 guest
>> (for shadow paging) as I recall, so when we come back to the L1 guest, it can
>> get a fault since its page tables are not marked writeable at L0 as they
>> need to be.
> 
> There can be different cases where an L0->L2 shadow nested page table is
> marked read only, in particular when a page is read only in L1's nested
> page tables.  If such a page is accessed by L2 while walking page tables
> it will cause a nested page fault (page table walks are write accesses).
> However, after kvm_mmu_unprotect_page you will get another page fault,
> and again in an endless stream.
> 
> Instead, emulation would have caused a nested page fault vmexit, I think.
> 

If possible, could you please give me some pointers on how to create this use
case so that we can get a definitive answer?

Looking at the code path, it appears that the new code (the
kvm_mmu_unprotect_page call) only runs if vcpu->arch.mmu_page_fault()
returns an indication that the instruction should be emulated. I would not
expect that to be the case in the scenario you described above, since L1 making
a page read-only (this is a page table for L2) is an error and should result in
#NPF being injected into L1. It's a bit hard for me to visualize the code flow
and figure out exactly how that would happen, but I just tried booting nested
virtualization and it seems to be working okay.
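
To make my reading of the flow concrete, here is a small user-space sketch
(not kernel code). The function model_page_fault() and both enums are
hypothetical names invented for illustration, and the bit positions of the
PFERR_* masks are an assumption based on my reading of the x86 KVM headers.
It only models the ordering: the unprotect path is considered solely when the
fault handler asks for emulation and the error code matches exactly.

```c
#include <stdint.h>

/* Error-code bits. The positions are an assumption, not taken from the patch. */
#define PFERR_PRESENT_MASK    (1ULL << 0)
#define PFERR_WRITE_MASK      (1ULL << 1)
#define PFERR_GUEST_PAGE_MASK (1ULL << 33)

/* Write fault on a present page, hit while walking the guest page tables. */
#define PFERR_NESTED_GUEST_PAGE \
	(PFERR_GUEST_PAGE_MASK | PFERR_WRITE_MASK | PFERR_PRESENT_MASK)

/* Hypothetical stand-in for what the MMU fault handler reports back. */
enum fault_result { FAULT_FIXED, FAULT_NEEDS_EMULATION };

/* Hypothetical stand-in for what the caller does next. */
enum fault_action {
	ACTION_RESUME_GUEST,         /* handler fixed the fault itself */
	ACTION_UNPROTECT_AND_RESUME, /* new path added by the patch */
	ACTION_EMULATE               /* fall through to instruction emulation */
};

enum fault_action model_page_fault(enum fault_result handler_result,
				   uint64_t error_code)
{
	/* The handler resolved the fault: the new check is never reached. */
	if (handler_result == FAULT_FIXED)
		return ACTION_RESUME_GUEST;

	/* Exact match, as in the patch: any extra bit skips this path. */
	if (error_code == PFERR_NESTED_GUEST_PAGE)
		return ACTION_UNPROTECT_AND_RESUME;

	return ACTION_EMULATE;
}
```

In this model, the endless-fault concern would only apply to the second
branch: if the page stays read-only for another reason after being
unprotected, the same error code would come back on the next fault.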

Is there a kvm-unit-test which I can run to trigger this scenario? Thanks

-Brijesh
