Message-ID: <d3775e15-9d5e-3746-84a0-a3049d20c3eb@oracle.com>
Date:   Wed, 2 Dec 2020 12:33:43 -0800
From:   Ankur Arora <ankur.a.arora@...cle.com>
To:     Joao Martins <joao.m.martins@...cle.com>,
        David Woodhouse <dwmw2@...radead.org>
Cc:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH RFC 03/39] KVM: x86/xen: register shared_info page

On 2020-12-02 2:44 a.m., Joao Martins wrote:
> [late response - was on holiday yesterday]
> 
> On 12/2/20 12:40 AM, Ankur Arora wrote:
>> On 2020-12-01 5:07 a.m., David Woodhouse wrote:
>>> On Wed, 2019-02-20 at 20:15 +0000, Joao Martins wrote:
>>>> +static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
>>>> +{
>>>> +       struct shared_info *shared_info;
>>>> +       struct page *page;
>>>> +
>>>> +       page = gfn_to_page(kvm, gfn);
>>>> +       if (is_error_page(page))
>>>> +               return -EINVAL;
>>>> +
>>>> +       kvm->arch.xen.shinfo_addr = gfn;
>>>> +
>>>> +       shared_info = page_to_virt(page);
>>>> +       memset(shared_info, 0, sizeof(struct shared_info));
>>>> +       kvm->arch.xen.shinfo = shared_info;
>>>> +       return 0;
>>>> +}
>>>> +
>>>
>>> Hm.
>>>
>>> How come we get to pin the page and directly dereference it every time,
>>> while kvm_setup_pvclock_page() has to use kvm_write_guest_cached()
>>> instead?
>>
>> So looking at my WIP trees from the time, this is something that
>> we went back and forth on as well with using just a pinned page or a
>> persistent kvm_vcpu_map().
>>
>> I remember distinguishing shared_info/vcpu_info from kvm_setup_pvclock_page()
>> because shared_info is created early and is not expected to change during the
>> lifetime of the guest. That didn't seem true for MSR_KVM_SYSTEM_TIME (or
>> MSR_KVM_STEAL_TIME), so those would need to either do a kvm_vcpu_map()/
>> kvm_vcpu_unmap() dance or do some kind of synchronization.
>>
>> That said, I don't think this code explicitly disallows any updates
>> to shared_info.
>>
>>>
>>> If that was allowed, wouldn't it have been a much simpler fix for
>>> CVE-2019-3016? What am I missing?
>>
>> Agreed.
>>
>> Perhaps Paolo can chime in on why KVM never uses a pinned page
>> and always prefers to do cached mappings instead?
>>
> Part of the CVE fix was to not use cached versions.
> 
> It's not a long-term pin of the page, unlike what we try to do here (partly due to the
> nature of the pages we are mapping), but we still map the gpa, RMW the steal time struct,
> and then unmap the page.
> 
> See record_steal_time() -- but more specifically commit b043138246 ("x86/KVM: Make sure
> KVM_VCPU_FLUSH_TLB flag is not missed").
> 
> But I am not sure it's a good idea to follow the same as record_steal_time() given that
> this is a fairly sensitive code path for event channels.
> 
>>>
>>> Should I rework these to use kvm_write_guest_cached()?
>>
>> kvm_vcpu_map() would be better. The event channel logic does RMW operations
>> on shared_info->vcpu_info.
>>
> Indeed, yes.
> 
> Ankur, IIRC we saw missed event channel notifications when we were using the
> {write,read}_cached() version of the patch.
> 
> But I can't remember what it was due to, either the evtchn_pending or the mask
> word -- which would make it not inject an upcall.

If memory serves, it was the mask. Though I don't think that we had
kvm_{write,read}_cached in use at that point -- given that they were
definitely not RMW safe.
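
To illustrate the "not RMW safe" point, here is a minimal user-space sketch. It is
not KVM code: shared_word_t, snapshot_read(), snapshot_write_back() and
mapped_set_bit() are hypothetical names, and C11 atomics stand in for the kernel's
bit operations on a mapped page. It only models why a read-copy/write-copy pair can
drop a bit that the other side set in between, while a single atomic RMW on the
shared word cannot.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one pending/mask word shared with the guest. */
typedef _Atomic uint64_t shared_word_t;

/* Copy-based access, step 1: take a private snapshot of the shared word. */
static uint64_t snapshot_read(shared_word_t *w)
{
	return atomic_load(w);
}

/*
 * Copy-based access, step 2: set a bit in the snapshot and write the whole
 * word back. Any bit the other side set after snapshot_read() is overwritten,
 * which is the lost-update hazard.
 */
static void snapshot_write_back(shared_word_t *w, uint64_t snap, unsigned int bit)
{
	atomic_store(w, snap | (UINT64_C(1) << bit));
}

/*
 * Mapped access: one atomic read-modify-write on the shared word itself.
 * Returns true if the bit was already set (similar in spirit to a
 * test-and-set bit operation).
 */
static bool mapped_set_bit(shared_word_t *w, unsigned int bit)
{
	uint64_t mask = UINT64_C(1) << bit;
	uint64_t old = atomic_fetch_or(w, mask);

	return (old & mask) != 0;
}
```

If the other side sets bit 3 between snapshot_read() and snapshot_write_back(),
bit 3 is gone after the write-back; with mapped_set_bit() on both sides, both bits
survive.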


Ankur

> 
> 	Joao
>