Message-ID: <6ea92fe2-4067-d0e0-b716-16d39a7a6065@oracle.com>
Date: Wed, 2 Dec 2020 12:32:11 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: David Woodhouse <dwmw2@...radead.org>,
Joao Martins <joao.m.martins@...cle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH RFC 03/39] KVM: x86/xen: register shared_info page
On 2020-12-02 4:20 a.m., David Woodhouse wrote:
> On Wed, 2020-12-02 at 10:44 +0000, Joao Martins wrote:
>> [late response - was on holiday yesterday]
>>
>> On 12/2/20 12:40 AM, Ankur Arora wrote:
>>> On 2020-12-01 5:07 a.m., David Woodhouse wrote:
>>>> On Wed, 2019-02-20 at 20:15 +0000, Joao Martins wrote:
>>>>> +static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
>>>>> +{
>>>>> +	struct shared_info *shared_info;
>>>>> +	struct page *page;
>>>>> +
>>>>> +	page = gfn_to_page(kvm, gfn);
>>>>> +	if (is_error_page(page))
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	kvm->arch.xen.shinfo_addr = gfn;
>>>>> +
>>>>> +	shared_info = page_to_virt(page);
>>>>> +	memset(shared_info, 0, sizeof(struct shared_info));
>>>>> +	kvm->arch.xen.shinfo = shared_info;
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>
>>>> Hm.
>>>>
>>>> How come we get to pin the page and directly dereference it every time,
>>>> while kvm_setup_pvclock_page() has to use kvm_write_guest_cached()
>>>> instead?
>>>
>>> So looking at my WIP trees from the time, this is something that
>>> we went back and forth on as well with using just a pinned page or a
>>> persistent kvm_vcpu_map().
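
For contrast, the cached flavour that kvm_setup_pvclock_page() uses is
roughly the below -- a minimal sketch, with error handling elided and
'gpa'/'hv_clock' standing in for the real fields:

	struct gfn_to_hva_cache ghc;
	struct pvclock_vcpu_time_info hv_clock;

	/* Translate once and cache the hva; the cache is revalidated
	 * if the memslots change. */
	kvm_gfn_to_hva_cache_init(kvm, &ghc, gpa, sizeof(hv_clock));

	/* Each update copies through the cached translation; no page
	 * stays pinned in between. */
	kvm_write_guest_cached(kvm, &ghc, &hv_clock, sizeof(hv_clock));
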
>>>
>>> I remember distinguishing shared_info/vcpu_info from kvm_setup_pvclock_page():
>>> shared_info is created early and is not expected to change during the
>>> lifetime of the guest, which didn't seem true for MSR_KVM_SYSTEM_TIME (or
>>> MSR_KVM_STEAL_TIME). Those would either need a kvm_vcpu_map()/
>>> kvm_vcpu_unmap() dance or some kind of synchronization.
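
To make the dance concrete, roughly this shape (a sketch only; error
handling and the exact offsets elided):

	struct kvm_host_map map;
	struct kvm_steal_time *st;

	/* kvm_vcpu_map() translates the gfn and kmaps/memremaps the
	 * backing page for the duration of this access. */
	if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
		return;

	st = map.hva + offset_in_page(gpa);
	/* ... read-modify-write *st while it is mapped ... */

	/* Unmap and mark the page dirty; nothing stays pinned across
	 * guest memory updates. */
	kvm_vcpu_unmap(vcpu, &map, true);
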
>>>
>>> That said, I don't think this code explicitly disallows any updates
>>> to shared_info.
>>>
>>>>
>>>> If that was allowed, wouldn't it have been a much simpler fix for
>>>> CVE-2019-3016? What am I missing?
>>>
>>> Agreed.
>>>
>>> Perhaps Paolo can chime in with why KVM never uses a pinned page
>>> and always prefers to do cached mappings instead?
>>>
>>
>> Part of the CVE fix was to not use the cached versions.
>>
>> It's not a long-term pin of the page, unlike what we try to do here (partly due
>> to the nature of the pages we are mapping), but we still map the gpa, RMW the
>> steal time struct, and then unmap the page.
>>
>> See record_steal_time() -- but more specifically commit b043138246 ("x86/KVM: Make sure
>> KVM_VCPU_FLUSH_TLB flag is not missed").
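
From memory, the crux of that commit is the atomic test-and-clear, so
that a flag set by another vCPU between the read and the write can't be
lost:

	/* xchg() is the atomic RMW; a plain load + store of
	 * st->preempted could race with the setter and drop
	 * KVM_VCPU_FLUSH_TLB. */
	if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
		kvm_vcpu_flush_tlb(vcpu);
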
>>
>> But I am not sure it's a good idea to follow the same approach as
>> record_steal_time(), given that this is a fairly sensitive code path for
>> event channels.
>
> Right. We definitely need to use atomic RMW operations (like the CVE
> fix did) so the page needs to be *mapped*.
>
> My question was about a permanent pinned mapping vs. mapping and
> unmapping as needed, the way record_steal_time() does.
>
> On IRC, Paolo told me that permanent pinning causes problems for memory
> hotplug, and pointed me at the trick we do with an MMU notifier and
> kvm_vcpu_reload_apic_access_page().

Okay, that answers my question. Thanks for clearing that up.

I'm not sure of a good place to document this, but it would be good to
have it written down somewhere. Maybe at kvm_map_gfn()?
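
For anyone reading along, the shape of that trick (as I read
kvm_arch_mmu_notifier_invalidate_range() in arch/x86/kvm/x86.c) is:
don't pin the page at all; when the notifier invalidates it, kick every
vCPU to re-resolve it before the next guest entry:

	void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
						    unsigned long start,
						    unsigned long end)
	{
		unsigned long apic_address;

		/* The APIC access page's address is cached in the VMCS.
		 * If its hva is being invalidated, ask all vCPUs to
		 * reload it (kvm_vcpu_reload_apic_access_page()) before
		 * re-entering the guest. */
		apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
		if (start <= apic_address && apic_address < end)
			kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
	}
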
>
> I'm going to stick with the pinning we have for the moment, and just
> fix up the fact that it leaks the pinned pages if the guest sets the
> shared_info address more than once.
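
Presumably something along these lines, then -- untested, just the hunk
from this patch with the old page released before pinning the new one:

	static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
	{
		struct shared_info *shared_info;
		struct page *page;

		page = gfn_to_page(kvm, gfn);
		if (is_error_page(page))
			return -EINVAL;

		/* Drop the reference gfn_to_page() took on the previously
		 * registered page, so repeated registrations don't leak. */
		if (kvm->arch.xen.shinfo)
			put_page(virt_to_page(kvm->arch.xen.shinfo));

		kvm->arch.xen.shinfo_addr = gfn;

		shared_info = page_to_virt(page);
		memset(shared_info, 0, sizeof(struct shared_info));
		kvm->arch.xen.shinfo = shared_info;
		return 0;
	}
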
>
> At some point the apic page MMU notifier thing can be made generic, and
> we can use that for this and for KVM steal time too.
>
Yeah, that's something that'll definitely be good to have.
Ankur