Message-ID: <20221104082843.GA4142342@chaop.bj.intel.com>
Date: Fri, 4 Nov 2022 16:28:43 +0800
From: Chao Peng <chao.p.peng@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
linux-doc@...r.kernel.org, qemu-devel@...gnu.org,
Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Hugh Dickins <hughd@...gle.com>,
Jeff Layton <jlayton@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Shuah Khan <shuah@...nel.org>, Mike Rapoport <rppt@...nel.org>,
Steven Price <steven.price@....com>,
"Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
Vlastimil Babka <vbabka@...e.cz>,
Vishal Annapurve <vannapurve@...gle.com>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
luto@...nel.org, jun.nakajima@...el.com, dave.hansen@...el.com,
ak@...ux.intel.com, david@...hat.com, aarcange@...hat.com,
ddutile@...hat.com, dhildenb@...hat.com,
Quentin Perret <qperret@...gle.com>, tabba@...gle.com,
Michael Roth <michael.roth@....com>, mhocko@...e.com,
Muchun Song <songmuchun@...edance.com>, wei.w.wang@...el.com
Subject: Re: [PATCH v9 5/8] KVM: Register/unregister the guest private memory
regions
On Thu, Nov 03, 2022 at 11:04:53PM +0000, Sean Christopherson wrote:
> On Tue, Oct 25, 2022, Chao Peng wrote:
> > @@ -4708,6 +4802,24 @@ static long kvm_vm_ioctl(struct file *filp,
> > r = kvm_vm_ioctl_set_memory_region(kvm, &mem);
> > break;
> > }
> > +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
> > + case KVM_MEMORY_ENCRYPT_REG_REGION:
> > + case KVM_MEMORY_ENCRYPT_UNREG_REGION: {
>
> I'm having second thoughts about usurping KVM_MEMORY_ENCRYPT_(UN)REG_REGION. Aside
> from the fact that restricted/protected memory may not be encrypted, there are
> other potential use cases for per-page memory attributes[*], e.g. to make memory
> read-only (or no-exec, or exec-only, etc...) without having to modify memslots.
>
> Any paravirt use case where the attributes of a page are effectively dictated by
> the guest is going to run into the exact same performance problems with memslots,
> which isn't surprising in hindsight since shared vs. private is really just an
> attribute, albeit with extra special semantics.
>
> And if we go with a brand new ioctl(), maybe someday in the very distant future
> we can deprecate and delete KVM_MEMORY_ENCRYPT_(UN)REG_REGION.
>
> Switching to a new ioctl() should be a minor change, i.e. shouldn't throw too big
> of a wrench into things.
>
> Something like:
>
> KVM_SET_MEMORY_ATTRIBUTES
>
> struct kvm_memory_attributes {
> __u64 address;
> __u64 size;
> __u64 flags;
> }

I like the idea of adding a new ioctl(). But packing all attributes into a
single flags field in the uAPI doesn't sound good to me: forcing userspace
to set every attribute in one call would be painful for userspace, and
probably for the KVM implementation as well. For private<->shared memory
conversion we only care about the KVM_MEM_ATTR_SHARED or
KVM_MEM_ATTR_PRIVATE bit, yet this API would force userspace to specify the
other, irrelevant bits too.

I looked at kvm_device_attr; it sounds like we can do something similar:

KVM_SET_MEMORY_ATTR

struct kvm_memory_attr {
	__u64 address;
	__u64 size;
#define KVM_MEM_ATTR_SHARED	BIT(0)
#define KVM_MEM_ATTR_READONLY	BIT(1)
#define KVM_MEM_ATTR_NOEXEC	BIT(2)
	__u32 attr;
	__u32 pad;
};

I'm not sure whether we need KVM_GET_MEMORY_ATTR/KVM_HAS_MEMORY_ATTR as
well, but it sounds like we do need a KVM_UNSET_MEMORY_ATTR.

Since we are exposing the attribute directly to userspace, I also think
we'd better treat shared memory as the default, so that the bit is still
meaningful even when private memory is not used. So define BIT(0) as
KVM_MEM_ATTR_PRIVATE instead of KVM_MEM_ATTR_SHARED.
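
As a rough illustration of the set/unset semantics proposed above, here is a
userspace-only model, not KVM code. The names mem_attr_update(), page_attrs[],
PAGE_SHIFT and NR_PAGES are made up for this sketch, BIT() is redefined
locally, and the flat per-page array is only an assumption to keep the example
small. The struct follows the proposal, with BIT(0) defined as
KVM_MEM_ATTR_PRIVATE so that an all-zero value means "shared".

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))
#define KVM_MEM_ATTR_PRIVATE	BIT(0)	/* clear (the default) means shared */
#define KVM_MEM_ATTR_READONLY	BIT(1)
#define KVM_MEM_ATTR_NOEXEC	BIT(2)

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define NR_PAGES	16	/* assumption: tiny toy address space */

struct kvm_memory_attr {
	uint64_t address;
	uint64_t size;
	uint32_t attr;
	uint32_t pad;
};

/* Hypothetical per-page attribute store; zero-initialized, i.e. all shared. */
static uint32_t page_attrs[NR_PAGES];

/*
 * Models KVM_SET_MEMORY_ATTR (set != 0) and KVM_UNSET_MEMORY_ATTR (set == 0).
 * Only the bits named in a->attr are changed; every other attribute of the
 * affected pages is left alone.
 */
static int mem_attr_update(const struct kvm_memory_attr *a, int set)
{
	uint64_t start = a->address >> PAGE_SHIFT;
	uint64_t npages = a->size >> PAGE_SHIFT;
	uint64_t i;

	/* Reject unaligned ranges, non-zero padding, and out-of-range requests. */
	if ((a->address | a->size) & ((1u << PAGE_SHIFT) - 1))
		return -1;
	if (a->pad || !npages || start >= NR_PAGES || npages > NR_PAGES - start)
		return -1;

	for (i = start; i < start + npages; i++) {
		if (set)
			page_attrs[i] |= a->attr;
		else
			page_attrs[i] &= ~a->attr;
	}
	return 0;
}
```

With this shape, converting a page private->shared is a single unset of
KVM_MEM_ATTR_PRIVATE on that range, and userspace never has to know or
restate whether the page is also read-only or no-exec.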

Thanks,
Chao

>
> [*] https://lore.kernel.org/all/Y1a1i9vbJ%2FpVmV9r@google.com
>
> > + struct kvm_enc_region region;
> > + bool set = ioctl == KVM_MEMORY_ENCRYPT_REG_REGION;
> > +
> > + if (!kvm_arch_has_private_mem(kvm))
> > + goto arch_vm_ioctl;
> > +
> > + r = -EFAULT;
> > + if (copy_from_user(&region, argp, sizeof(region)))
> > + goto out;
> > +
> > + r = kvm_vm_ioctl_set_mem_attr(kvm, region.addr,
> > + region.size, set);
> > + break;
> > + }
> > +#endif