linux-kernel - Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210302151516.GA30137@ashkalra_ubuntu_server>
Date:   Tue, 2 Mar 2021 15:15:16 +0000
From:   Ashish Kalra <ashish.kalra@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Steve Rutherford <srutherford@...gle.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "joro@...tes.org" <joro@...tes.org>,
        "Lendacky, Thomas" <Thomas.Lendacky@....com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "venu.busireddy@...cle.com" <venu.busireddy@...cle.com>,
        "Singh, Brijesh" <brijesh.singh@....com>,
        Will Deacon <will@...nel.org>,
        Quentin Perret <qperret@...gle.com>
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST
 ioctl

On Tue, Mar 02, 2021 at 02:55:43PM +0000, Ashish Kalra wrote:
> On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> > +Will and Quentin (arm64)
> > 
> > Moving the non-KVM x86 folks to bcc, I don't they care about KVM details at this
> > point.
> > 
> > On Fri, Feb 26, 2021, Ashish Kalra wrote:
> > > On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> > > > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@....com> wrote:
> > > > Thanks for grabbing the data!
> > > > 
> > > > I am fine with both paths. Sean has stated an explicit desire for
> > > > hypercall exiting, so I think that would be the current consensus.
> > 
> > Yep, though it'd be good to get Paolo's input, too.
> > 
> > > > If we want to do hypercall exiting, this should be in a follow-up
> > > > series where we implement something more generic, e.g. a hypercall
> > > > exiting bitmap or hypercall exit list. If we are taking the hypercall
> > > > exit route, we can drop the kvm side of the hypercall.
> > 
> > I don't think this is a good candidate for arbitrary hypercall interception.  Or
> > rather, I think hypercall interception should be an orthogonal implementation.
> > 
> > The guest, including guest firmware, needs to be aware that the hypercall is
> > supported, and the ABI needs to be well-defined.  Relying on userspace VMMs to
> > implement a common ABI is an unnecessary risk.
> > 
> > We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
> > require further VMM intervention.  But, I just don't see the point, it would
> > save only a few lines of code.  It would also limit what KVM could do in the
> > future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
> > then mandatory interception would essentially make it impossible for KVM to do
> > bookkeeping while still honoring the interception request.
> > 
> > However, I do think it would make sense to have the userspace exit be a generic
> > exit type.  But hey, we already have the necessary ABI defined for that!  It's
> > just not used anywhere.
> > 
> > 	/* KVM_EXIT_HYPERCALL */
> > 	struct {
> > 		__u64 nr;
> > 		__u64 args[6];
> > 		__u64 ret;
> > 		__u32 longmode;
> > 		__u32 pad;
> > 	} hypercall;
> > 
> > 
> > > > Userspace could also handle the MSR using MSR filters (would need to
> > > > confirm that).  Then userspace could also be in control of the cpuid bit.
> > 
> > An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
> > The data limitation could be fudged by shoving data into non-standard GPRs, but
> > that will result in truly heinous guest code, and extensibility issues.
> > 
> > The data limitation is a moot point, because the x86-only thing is a deal
> > breaker.  arm64's pKVM work has a near-identical use case for a guest to share
> > memory with a host.  I can't think of a clever way to avoid having to support
> > TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
> > multiple KVM variants.
> 
> Looking at arm64's pKVM work, i see that it is a recently introduced RFC
> patch-set and probably relevant to arm64 nVHE hypervisor
> mode/implementation, and potentially makes sense as it adds guest
> memory protection as both host and guest kernels are running on the same
> privilege level ?
> 
> Though i do see that the pKVM stuff adds two hypercalls, specifically :
> 
> pkvm_create_mappings() ( I assume this is for setting shared memory
> regions between host and guest) &
> pkvm_create_private_mappings().
> 
> And the use-cases are quite similar to memory protection architectues
> use cases, for example, use with virtio devices, guest DMA I/O, etc.
> 
> But, isn't this patch set still RFC, and though i agree that it adds
> an infrastructure for standardised communication between the host and
> it's guests for mutually controlled shared memory regions and
> surely adds some kind of portability between hypervisor
> implementations, but nothing is standardised still, right ?
> 

And to add here, the hypercall implementation is in-HYP mode,
there is no infrastructure as part of this patch-set to do
hypercall exiting and handling it in user-space. 

Though arguably, we may able to add a hypercall exiting code path on the
amd64 implementation for the same hypercall interfaces ?

Alternatively, we implement this in-kernel and then add SET/GET ioctl
interfaces to export the shared pages/regions list to user-space.

Thanks,
Ashish