lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM5PR12MB1386C01E72A71F3AB6EB1F068EDE0@DM5PR12MB1386.namprd12.prod.outlook.com>
Date:   Fri, 10 Apr 2020 20:55:13 +0000
From:   "Kalra, Ashish" <Ashish.Kalra@....com>
To:     Steve Rutherford <srutherford@...gle.com>
CC:     Krish Sadhukhan <krish.sadhukhan@...cle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Joerg Roedel <joro@...tes.org>,
        Borislav Petkov <bp@...e.de>,
        "Lendacky, Thomas" <Thomas.Lendacky@....com>,
        X86 ML <x86@...nel.org>, KVM list <kvm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        David Rientjes <rientjes@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        "Singh, Brijesh" <brijesh.singh@....com>
Subject: RE: [PATCH v6 12/14] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET
 ioctl

[AMD Official Use Only - Internal Distribution Only]

Hello Steve,

-----Original Message-----
From: Steve Rutherford <srutherford@...gle.com> 
Sent: Friday, April 10, 2020 3:19 PM
To: Kalra, Ashish <Ashish.Kalra@....com>
Cc: Krish Sadhukhan <krish.sadhukhan@...cle.com>; Paolo Bonzini <pbonzini@...hat.com>; Thomas Gleixner <tglx@...utronix.de>; Ingo Molnar <mingo@...hat.com>; H. Peter Anvin <hpa@...or.com>; Joerg Roedel <joro@...tes.org>; Borislav Petkov <bp@...e.de>; Lendacky, Thomas <Thomas.Lendacky@....com>; X86 ML <x86@...nel.org>; KVM list <kvm@...r.kernel.org>; LKML <linux-kernel@...r.kernel.org>; David Rientjes <rientjes@...gle.com>; Andy Lutomirski <luto@...nel.org>; Singh, Brijesh <brijesh.singh@....com>
Subject: Re: [PATCH v6 12/14] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl

On Fri, Apr 10, 2020 at 1:16 PM Steve Rutherford <srutherford@...gle.com> wrote:
>
> On Fri, Apr 10, 2020 at 11:14 AM Steve Rutherford 
> <srutherford@...gle.com> wrote:
> >
> > On Thu, Apr 9, 2020 at 6:34 PM Ashish Kalra <ashish.kalra@....com> wrote:
> > >
> > > Hello Steve,
> > >
> > > On Thu, Apr 09, 2020 at 05:59:56PM -0700, Steve Rutherford wrote:
> > > > On Tue, Apr 7, 2020 at 6:52 PM Ashish Kalra <ashish.kalra@....com> wrote:
> > > > >
> > > > > Hello Steve,
> > > > >
> > > > > On Tue, Apr 07, 2020 at 06:25:51PM -0700, Steve Rutherford wrote:
> > > > > > On Mon, Apr 6, 2020 at 11:53 AM Krish Sadhukhan 
> > > > > > <krish.sadhukhan@...cle.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 4/3/20 2:45 PM, Ashish Kalra wrote:
> > > > > > > > On Fri, Apr 03, 2020 at 02:14:23PM -0700, Krish Sadhukhan wrote:
> > > > > > > >> On 3/29/20 11:23 PM, Ashish Kalra wrote:
> > > > > > > >>> From: Ashish Kalra <ashish.kalra@....com>
> > > > > > > >>>
> > > > > > > >>> This ioctl can be used by the application to reset the 
> > > > > > > >>> page encryption bitmap managed by the KVM driver. A 
> > > > > > > >>> typical usage for this ioctl is on VM reboot, on 
> > > > > > > >>> reboot, we must reinitialize the bitmap.
> > > > > > > >>>
> > > > > > > >>> Signed-off-by: Ashish Kalra <ashish.kalra@....com>
> > > > > > > >>> ---
> > > > > > > >>>    Documentation/virt/kvm/api.rst  | 13 +++++++++++++
> > > > > > > >>>    arch/x86/include/asm/kvm_host.h |  1 +
> > > > > > > >>>    arch/x86/kvm/svm.c              | 16 ++++++++++++++++
> > > > > > > >>>    arch/x86/kvm/x86.c              |  6 ++++++
> > > > > > > >>>    include/uapi/linux/kvm.h        |  1 +
> > > > > > > >>>    5 files changed, 37 insertions(+)
> > > > > > > >>>
> > > > > > > >>> diff --git a/Documentation/virt/kvm/api.rst 
> > > > > > > >>> b/Documentation/virt/kvm/api.rst index 
> > > > > > > >>> 4d1004a154f6..a11326ccc51d 100644
> > > > > > > >>> --- a/Documentation/virt/kvm/api.rst
> > > > > > > >>> +++ b/Documentation/virt/kvm/api.rst
> > > > > > > >>> @@ -4698,6 +4698,19 @@ During the guest live migration the outgoing guest exports its page encryption
> > > > > > > >>>    bitmap, the KVM_SET_PAGE_ENC_BITMAP can be used to build the page encryption
> > > > > > > >>>    bitmap for an incoming guest.
> > > > > > > >>> +4.127 KVM_PAGE_ENC_BITMAP_RESET (vm ioctl)
> > > > > > > >>> +-----------------------------------------
> > > > > > > >>> +
> > > > > > > >>> +:Capability: basic
> > > > > > > >>> +:Architectures: x86
> > > > > > > >>> +:Type: vm ioctl
> > > > > > > >>> +:Parameters: none
> > > > > > > >>> +:Returns: 0 on success, -1 on error
> > > > > > > >>> +
> > > > > > > >>> +The KVM_PAGE_ENC_BITMAP_RESET is used to reset the 
> > > > > > > >>> +guest's page encryption bitmap during guest reboot and this is only done on the guest's boot vCPU.
> > > > > > > >>> +
> > > > > > > >>> +
> > > > > > > >>>    5. The kvm_run structure
> > > > > > > >>>    ======================== diff --git 
> > > > > > > >>> a/arch/x86/include/asm/kvm_host.h 
> > > > > > > >>> b/arch/x86/include/asm/kvm_host.h index 
> > > > > > > >>> d30f770aaaea..a96ef6338cd2 100644
> > > > > > > >>> --- a/arch/x86/include/asm/kvm_host.h
> > > > > > > >>> +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > > >>> @@ -1273,6 +1273,7 @@ struct kvm_x86_ops {
> > > > > > > >>>                             struct kvm_page_enc_bitmap *bmap);
> > > > > > > >>>     int (*set_page_enc_bitmap)(struct kvm *kvm,
> > > > > > > >>>                             struct kvm_page_enc_bitmap 
> > > > > > > >>> *bmap);
> > > > > > > >>> +   int (*reset_page_enc_bitmap)(struct kvm *kvm);
> > > > > > > >>>    };
> > > > > > > >>>    struct kvm_arch_async_pf { diff --git 
> > > > > > > >>> a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 
> > > > > > > >>> 313343a43045..c99b0207a443 100644
> > > > > > > >>> --- a/arch/x86/kvm/svm.c
> > > > > > > >>> +++ b/arch/x86/kvm/svm.c
> > > > > > > >>> @@ -7797,6 +7797,21 @@ static int svm_set_page_enc_bitmap(struct kvm *kvm,
> > > > > > > >>>     return ret;
> > > > > > > >>>    }
> > > > > > > >>> +static int svm_reset_page_enc_bitmap(struct kvm *kvm) 
> > > > > > > >>> +{
> > > > > > > >>> +   struct kvm_sev_info *sev = 
> > > > > > > >>> +&to_kvm_svm(kvm)->sev_info;
> > > > > > > >>> +
> > > > > > > >>> +   if (!sev_guest(kvm))
> > > > > > > >>> +           return -ENOTTY;
> > > > > > > >>> +
> > > > > > > >>> +   mutex_lock(&kvm->lock);
> > > > > > > >>> +   /* by default all pages should be marked encrypted */
> > > > > > > >>> +   if (sev->page_enc_bmap_size)
> > > > > > > >>> +           bitmap_fill(sev->page_enc_bmap, sev->page_enc_bmap_size);
> > > > > > > >>> +   mutex_unlock(&kvm->lock);
> > > > > > > >>> +   return 0;
> > > > > > > >>> +}
> > > > > > > >>> +
> > > > > > > >>>    static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
> > > > > > > >>>    {
> > > > > > > >>>     struct kvm_sev_cmd sev_cmd; @@ -8203,6 +8218,7 @@ 
> > > > > > > >>> static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
> > > > > > > >>>     .page_enc_status_hc = svm_page_enc_status_hc,
> > > > > > > >>>     .get_page_enc_bitmap = svm_get_page_enc_bitmap,
> > > > > > > >>>     .set_page_enc_bitmap = svm_set_page_enc_bitmap,
> > > > > > > >>> +   .reset_page_enc_bitmap = 
> > > > > > > >>> + svm_reset_page_enc_bitmap,
> > > > > > > >>
> > > > > > > >> We don't need to initialize the intel ops to NULL ? 
> > > > > > > >> It's not initialized in the previous patch either.
> > > > > > > >>
> > > > > > > >>>    };
> > > > > > > > This struct is declared as "static storage", so won't 
> > > > > > > > the non-initialized members be 0 ?
> > > > > > >
> > > > > > >
> > > > > > > Correct. Although, I see that 'nested_enable_evmcs' is 
> > > > > > > explicitly initialized. We should maintain the convention, perhaps.
> > > > > > >
> > > > > > > >
> > > > > > > >>>    static int __init svm_init(void) diff --git 
> > > > > > > >>> a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 
> > > > > > > >>> 05e953b2ec61..2127ed937f53 100644
> > > > > > > >>> --- a/arch/x86/kvm/x86.c
> > > > > > > >>> +++ b/arch/x86/kvm/x86.c
> > > > > > > >>> @@ -5250,6 +5250,12 @@ long kvm_arch_vm_ioctl(struct file *filp,
> > > > > > > >>>                     r = kvm_x86_ops->set_page_enc_bitmap(kvm, &bitmap);
> > > > > > > >>>             break;
> > > > > > > >>>     }
> > > > > > > >>> +   case KVM_PAGE_ENC_BITMAP_RESET: {
> > > > > > > >>> +           r = -ENOTTY;
> > > > > > > >>> +           if (kvm_x86_ops->reset_page_enc_bitmap)
> > > > > > > >>> +                   r = kvm_x86_ops->reset_page_enc_bitmap(kvm);
> > > > > > > >>> +           break;
> > > > > > > >>> +   }
> > > > > > > >>>     default:
> > > > > > > >>>             r = -ENOTTY;
> > > > > > > >>>     }
> > > > > > > >>> diff --git a/include/uapi/linux/kvm.h 
> > > > > > > >>> b/include/uapi/linux/kvm.h index 
> > > > > > > >>> b4b01d47e568..0884a581fc37 100644
> > > > > > > >>> --- a/include/uapi/linux/kvm.h
> > > > > > > >>> +++ b/include/uapi/linux/kvm.h
> > > > > > > >>> @@ -1490,6 +1490,7 @@ struct kvm_enc_region {
> > > > > > > >>>    #define KVM_GET_PAGE_ENC_BITMAP  _IOW(KVMIO, 0xc5, struct kvm_page_enc_bitmap)
> > > > > > > >>>    #define KVM_SET_PAGE_ENC_BITMAP  _IOW(KVMIO, 0xc6, 
> > > > > > > >>> struct kvm_page_enc_bitmap)
> > > > > > > >>> +#define KVM_PAGE_ENC_BITMAP_RESET  _IO(KVMIO, 0xc7)
> > > > > > > >>>    /* Secure Encrypted Virtualization command */
> > > > > > > >>>    enum sev_cmd_id {
> > > > > > > >> Reviewed-by: Krish Sadhukhan 
> > > > > > > >> <krish.sadhukhan@...cle.com>
> > > > > >
> > > > > >
> > > > > > Doesn't this overlap with the set ioctl? Yes, obviously, you 
> > > > > > have to copy the new value down and do a bit more work, but 
> > > > > > I don't think resetting the bitmap is going to be the 
> > > > > > bottleneck on reboot. Seems excessive to add another ioctl for this.
> > > > >
> > > > > The set ioctl is generally available/provided for the incoming 
> > > > > VM to setup the page encryption bitmap, this reset ioctl is 
> > > > > meant for the source VM as a simple interface to reset the whole page encryption bitmap.
> > > > >
> > > > > Thanks,
> > > > > Ashish
> > > >
> > > >
> > > > Hey Ashish,
> > > >
> > > > These seem very overlapping. I think this API should be refactored a bit.
> > > >
> > > > 1) Use kvm_vm_ioctl_enable_cap to control whether or not this 
> > > > hypercall (and related feature bit) is offered to the VM, and 
> > > > also the size of the buffer.
> > >
> > > If you look at patch 13/14, i have added a new kvm para feature 
> > > called "KVM_FEATURE_SEV_LIVE_MIGRATION" which indicates host 
> > > support for SEV Live Migration and a new Custom MSR which the 
> > > guest does a wrmsr to enable the Live Migration feature, so this 
> > > is like the enable cap support.
> > >
> > > There are further extensions to this support i am adding, so patch 
> > > 13/14 of this patch-set is still being enhanced and will have full 
> > > support when i repost next.
> > >
> > > > 2) Use set for manipulating values in the bitmap, including 
> > > > resetting the bitmap. Set the bitmap pointer to null if you want 
> > > > to reset to all 0xFFs. When the bitmap pointer is set, it should 
> > > > set the values to exactly what is pointed at, instead of only 
> > > > clearing bits, as is done currently.
> > >
> > > As i mentioned in my earlier email, the set api is supposed to be 
> > > for the incoming VM, but if you really need to use it for the 
> > > outgoing VM then it can be modified.
> > >
> > > > 3) Use get for fetching values from the kernel. Personally, I'd 
> > > > require alignment of the base GFN to a multiple of 8 (but the 
> > > > number of pages could be whatever), so you can just use a 
> > > > memcpy. Optionally, you may want some way to tell userspace the 
> > > > size of the existing buffer, so it can ensure that it can ask 
> > > > for the entire buffer without having to track the size in 
> > > > usermode (not strictly necessary, but nice to have since it 
> > > > ensures that there is only one place that has to manage this value).
> > > >
> > > > If you want to expand or contract the bitmap, you can use enable 
> > > > cap to adjust the size.
> > >
> > > As being discussed on the earlier mail thread, we are doing this 
> > > dynamically now by computing the guest RAM size when the 
> > > set_user_memory_region ioctl is invoked. I believe that should 
> > > handle the hot-plug and hot-unplug events too, as any hot memory 
> > > updates will need KVM memslots to be updated.
> > Ahh, sorry, forgot you mentioned this: yes this can work. Host needs 
> > to be able to decide not to allocate, but this should be workable.
> > >
> > > > If you don't want to offer the hypercall to the guest, don't 
> > > > call the enable cap.
> > > > This API avoids using up another ioctl. Ioctl space is somewhat 
> > > > scarce. It also gives userspace fine grained control over the 
> > > > buffer, so it can support both hot-plug and hot-unplug (or at 
> > > > the very least it is not obviously incompatible with those). It 
> > > > also gives userspace control over whether or not the feature is 
> > > > offered. The hypercall isn't free, and being able to tell guests 
> > > > to not call when the host wasn't going to migrate it anyway will be useful.
> > > >
> > >
> > > As i mentioned above, now the host indicates if it supports the 
> > > Live Migration feature and the feature and the hypercall are only 
> > > enabled on the host when the guest checks for this support and 
> > > does a wrmsr() to enable the feature. Also the guest will not make 
> > > the hypercall if the host does not indicate support for it.
> > If my read of those patches was correct, the host will always 
> > advertise support for the hypercall. And the only bit controlling 
> > whether or not the hypercall is advertised is essentially the kernel 
> > version. You need to rollout a new kernel to disable the hypercall.
>
> Ahh, awesome, I see I misunderstood how the CPUID bits get passed
> through: usermode can still override them. Forgot about the back and 
> forth for CPUID with usermode. My point about informing the guest 
> kernel is clearly moot. The host still needs the ability to prevent 
> allocations, but that is more minor. Maybe use a flag on the memslots 
> directly?
> On second thought: burning the memslot flag for 30mb per tb of VM seems like a waste.

Currently, I am still using the approach of a "unified" page encryption bitmap instead of a 
bitmap per memslot, with the main change being that the resizing is only done whenever
there are any updates in memslots, when memslots are updated using the
kvm_arch_commit_memory_region() interface.  

Thanks,
Ashish

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ