linux-kernel - Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3

Message-ID: <373DF203-88D9-4501-AC0F-CB7D191050B1@amd.com>
Date:   Tue, 8 Dec 2020 05:18:39 +0000
From:   "Kalra, Ashish" <Ashish.Kalra@....com>
To:     Steve Rutherford <srutherford@...gle.com>
CC:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Joerg Roedel <joro@...tes.org>,
        Borislav Petkov <bp@...e.de>,
        "Lendacky, Thomas" <Thomas.Lendacky@....com>,
        X86 ML <x86@...nel.org>, KVM list <kvm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "Singh, Brijesh" <brijesh.singh@....com>,
        "dovmurik@...ux.vnet.ibm.com" <dovmurik@...ux.vnet.ibm.com>,
        "tobin@....com" <tobin@....com>,
        "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
        "frankeh@...ibm.com" <frankeh@...ibm.com>,
        "dgilbert@...hat.com" <dgilbert@...hat.com>
Subject: Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3


> 
>> I suspect a list
>> would consume far less memory, hopefully without impacting performance.

And how much host memory are we talking about here? For a 4 GB guest, say, the bitmap will use only something like 128k+.
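
As a rough check of that figure, here is a minimal sketch of the arithmetic. It assumes 4 KiB pages and one bit per guest page; neither is stated above, and the helper name bitmap_bytes() is hypothetical, not something from the patch series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a bitmap that tracks guest memory with one bit
 * per page. Both the page size and the one-bit-per-page layout are
 * assumptions made for this estimate.
 */
static uint64_t bitmap_bytes(uint64_t guest_bytes, uint64_t page_size)
{
	uint64_t pages;

	assert(page_size != 0);
	/* Round up so a partial trailing page still gets a bit. */
	pages = (guest_bytes + page_size - 1) / page_size;
	/* Round up to whole bytes. */
	return (pages + 7) / 8;
}
```

With guest_bytes = 4ULL << 30 and page_size = 4096 this gives 1048576 pages, so 131072 bytes (128 KiB), and the cost grows linearly with guest size.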

Thanks,
Ashish

> On Dec 7, 2020, at 10:16 PM, Kalra, Ashish <Ashish.Kalra@....com> wrote:
> 
> I don’t think that the bitmap by itself is really a performance bottleneck here.
> 
> Thanks,
> Ashish
> 
>>> On Dec 7, 2020, at 9:10 PM, Steve Rutherford <srutherford@...gle.com> wrote:
>>> On Mon, Dec 7, 2020 at 12:42 PM Sean Christopherson <seanjc@...gle.com> wrote:
>>>> On Sun, Dec 06, 2020, Paolo Bonzini wrote:
>>>> On 03/12/20 01:34, Sean Christopherson wrote:
>>>>> On Tue, Dec 01, 2020, Ashish Kalra wrote:
>>>>>> From: Brijesh Singh <brijesh.singh@....com>
>>>>>> KVM hypercall framework relies on alternative framework to patch the
>>>>>> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
>>>>>> apply_alternative() is called then it defaults to VMCALL. The approach
>>>>>> works fine on non SEV guest. A VMCALL would cause #UD, and hypervisor
>>>>>> will be able to decode the instruction and do the right things. But
>>>>>> when SEV is active, guest memory is encrypted with guest key and
>>>>>> hypervisor will not be able to decode the instruction bytes.
>>>>>> Add SEV specific hypercall3, it unconditionally uses VMMCALL. The hypercall
>>>>>> will be used by the SEV guest to notify encrypted pages to the hypervisor.
>>>>> What if we invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to VMMCALL
>>>>> and opt into VMCALL?  It's a synthetic feature flag either way, and I don't
>>>>> think there are any existing KVM hypercalls that happen before alternatives are
>>>>> patched, i.e. it'll be a nop for sane kernel builds.
>>>>> I'm also skeptical that a KVM specific hypercall is the right approach for the
>>>>> encryption behavior, but I'll take that up in the patches later in the series.
>>>> Do you think that it's the guest that should "donate" memory for the bitmap
>>>> instead?
>>> No.  Two things I'd like to explore:
>>> 1. Making the hypercall to announce/request private vs. shared common across
>>>   hypervisors (KVM, Hyper-V, VMware, etc...) and technologies (SEV-* and TDX).
>>>   I'm concerned that we'll end up with multiple hypercalls that do more or
>>>   less the same thing, e.g. KVM+SEV, Hyper-V+SEV, TDX, etc...  Maybe it's a
>>>   pipe dream, but I'd like to at least explore options before shoving in KVM-
>>>   only hypercalls.
>>> 2. Tracking shared memory via a list of ranges instead of using a bitmap to
>>>   track all of guest memory.  For most use cases, the vast majority of guest
>>>   memory will be private, most ranges will be 2mb+, and conversions between
>>>   private and shared will be uncommon events, i.e. the overhead to walk and
>>>   split/merge list entries is hopefully not a big concern.  I suspect a list
>>>   would consume far less memory, hopefully without impacting performance.
>> For a fancier data structure, I'd suggest an interval tree. Linux
>> already has an rbtree-based interval tree implementation, which would
>> likely work, and would probably assuage any performance concerns.
>> Something like this would not be worth doing unless most of the shared
>> pages were physically contiguous. A sample Ubuntu 20.04 VM on GCP had
>> 60ish discontiguous shared regions. This is by no means a thorough
>> search, but it's suggestive. If this is typical, then the bitmap would
>> be far less efficient than most any interval-based data structure.
>> You'd have to allow userspace to upper bound the number of intervals
>> (similar to the maximum bitmap size), to prevent host OOMs due to
>> malicious guests. There's something nice about the guest donating
>> memory for this, since that would eliminate the OOM risk.