[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6ea92fe2-4067-d0e0-b716-16d39a7a6065@oracle.com>
Date: Wed, 2 Dec 2020 12:32:11 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: David Woodhouse <dwmw2@...radead.org>,
Joao Martins <joao.m.martins@...cle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH RFC 03/39] KVM: x86/xen: register shared_info page
On 2020-12-02 4:20 a.m., David Woodhouse wrote:
> On Wed, 2020-12-02 at 10:44 +0000, Joao Martins wrote:
>> [late response - was on holiday yesterday]
>>
>> On 12/2/20 12:40 AM, Ankur Arora wrote:
>>> On 2020-12-01 5:07 a.m., David Woodhouse wrote:
>>>> On Wed, 2019-02-20 at 20:15 +0000, Joao Martins wrote:
>>>>> +static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
>>>>> +{
>>>>> + struct shared_info *shared_info;
>>>>> + struct page *page;
>>>>> +
>>>>> + page = gfn_to_page(kvm, gfn);
>>>>> + if (is_error_page(page))
>>>>> + return -EINVAL;
>>>>> +
>>>>> + kvm->arch.xen.shinfo_addr = gfn;
>>>>> +
>>>>> + shared_info = page_to_virt(page);
>>>>> + memset(shared_info, 0, sizeof(struct shared_info));
>>>>> + kvm->arch.xen.shinfo = shared_info;
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>
>>>> Hm.
>>>>
>>>> How come we get to pin the page and directly dereference it every time,
>>>> while kvm_setup_pvclock_page() has to use kvm_write_guest_cached()
>>>> instead?
>>>
>>> So looking at my WIP trees from the time, this is something that
>>> we went back and forth on as well with using just a pinned page or a
>>> persistent kvm_vcpu_map().
>>>
>>> I remember distinguishing shared_info/vcpu_info from kvm_setup_pvclock_page()
>>> as shared_info is created early and is not expected to change during the
>>> lifetime of the guest which didn't seem true for MSR_KVM_SYSTEM_TIME (or
>>> MSR_KVM_STEAL_TIME) so that would either need to do a kvm_vcpu_map()
>>> kvm_vcpu_unmap() dance or do some kind of synchronization.
>>>
>>> That said, I don't think this code explicitly disallows any updates
>>> to shared_info.
>>>
>>>>
>>>> If that was allowed, wouldn't it have been a much simpler fix for
>>>> CVE-2019-3016? What am I missing?
>>>
>>> Agreed.
>>>
>>> Perhaps, Paolo can chime in with why KVM never uses pinned page
>>> and always prefers to do cached mappings instead?
>>>
>>
>> Part of the CVE fix to not use cached versions.
>>
>> It's not a longterm pin of the page unlike we try to do here (partly due to the nature
>> of the pages we are mapping) but we still we map the gpa, RMW the steal time struct, and
>> then unmap the page.
>>
>> See record_steal_time() -- but more specifically commit b043138246 ("x86/KVM: Make sure
>> KVM_VCPU_FLUSH_TLB flag is not missed").
>>
>> But I am not sure it's a good idea to follow the same as record_steal_time() given that
>> this is a fairly sensitive code path for event channels.
>
> Right. We definitely need to use atomic RMW operations (like the CVE
> fix did) so the page needs to be *mapped*.
>
> My question was about a permanent pinned mapping vs the map/unmap as we
> need it that record_steal_time() does.
>
> On IRC, Paolo told me that permanent pinning causes problems for memory
> hotplug, and pointed me at the trick we do with an MMU notifier and
> kvm_vcpu_reload_apic_access_page().
Okay that answers my question. Thanks for clearing that up.
Not sure of a good place to document this but it would be good to
have this written down somewhere. Maybe kvm_map_gfn()?
>
> I'm going to stick with the pinning we have for the moment, and just
> fix up the fact that it leaks the pinned pages if the guest sets the
> shared_info address more than once.
>
> At some point the apic page MMU notifier thing can be made generic, and
> we can use that for this and for KVM steal time too.
>
Yeah, that's something that'll definitely be good to have.
Ankur
Powered by blists - more mailing lists