[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220803100835.GD607465@chaop.bj.intel.com>
Date: Wed, 3 Aug 2022 18:08:35 +0800
From: Chao Peng <chao.p.peng@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-api@...r.kernel.org, linux-doc@...r.kernel.org,
qemu-devel@...gnu.org, linux-kselftest@...r.kernel.org,
Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Hugh Dickins <hughd@...gle.com>,
Jeff Layton <jlayton@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Shuah Khan <shuah@...nel.org>, Mike Rapoport <rppt@...nel.org>,
Steven Price <steven.price@....com>,
"Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
Vlastimil Babka <vbabka@...e.cz>,
Vishal Annapurve <vannapurve@...gle.com>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
luto@...nel.org, jun.nakajima@...el.com, dave.hansen@...el.com,
ak@...ux.intel.com, david@...hat.com, aarcange@...hat.com,
ddutile@...hat.com, dhildenb@...hat.com,
Quentin Perret <qperret@...gle.com>,
Michael Roth <michael.roth@....com>, mhocko@...e.com,
Muchun Song <songmuchun@...edance.com>
Subject: Re: [PATCH v7 09/14] KVM: Extend the memslot to support fd-based
private memory
On Fri, Jul 29, 2022 at 07:51:29PM +0000, Sean Christopherson wrote:
> On Wed, Jul 06, 2022, Chao Peng wrote:
> > @@ -1332,9 +1332,18 @@ yet and must be cleared on entry.
> > __u64 userspace_addr; /* start of the userspace allocated memory */
> > };
> >
> > + struct kvm_userspace_memory_region_ext {
> > + struct kvm_userspace_memory_region region;
> > + __u64 private_offset;
> > + __u32 private_fd;
> > + __u32 pad1;
> > + __u64 pad2[14];
> > +};
> > +
> > /* for kvm_memory_region::flags */
> > #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
> > #define KVM_MEM_READONLY (1UL << 1)
> > + #define KVM_MEM_PRIVATE (1UL << 2)
>
> Very belatedly following up on prior feedback...
>
> | I think a flag is still needed, the problem is private_fd can be safely
> | accessed only when this flag is set, e.g. without this flag, we can't
> | copy_from_user these new fields since they don't exist for previous
> | kvm_userspace_memory_region callers.
>
> I forgot about that aspect of things. We don't technically need a dedicated
> PRIVATE flag to handle that, but it does seem to be the least awful soltuion.
> We could either add a generic KVM_MEM_EXTENDED_REGION or an entirely new
> ioctl(), e.g. KVM_SET_USER_MEMORY_REGION2, but in both approaches there's a decent
> chance that we'll end up needed individual "this field is valid" flags anways.
>
> E.g. if KVM requires pad1 and pad2 to be zero to carve out future extensions,
> then we're right back here if some future extension needs to treat '0' as a legal
> input.
I had such practice (always rejecting none-zero 'pad' value when
introducing new user APIs) in other project previously, but I rarely
see that in KVM.
>
> TL;DR: adding KVM_MEM_PRIVATE still seems like the best approach.
>
> > @@ -4631,14 +4658,35 @@ static long kvm_vm_ioctl(struct file *filp,
> > break;
> > }
> > case KVM_SET_USER_MEMORY_REGION: {
> > - struct kvm_userspace_memory_region kvm_userspace_mem;
> > + struct kvm_user_mem_region mem;
> > + unsigned long size;
> > + u32 flags;
> > +
> > + kvm_sanity_check_user_mem_region_alias();
> > +
> > + memset(&mem, 0, sizeof(mem));
> >
> > r = -EFAULT;
> > - if (copy_from_user(&kvm_userspace_mem, argp,
> > - sizeof(kvm_userspace_mem)))
> > +
> > + if (get_user(flags,
> > + (u32 __user *)(argp + offsetof(typeof(mem), flags))))
> > + goto out;
>
>
> Indentation is funky. It's hard to massage this into something short and
> readable What about capturing the offset separately? E.g.
>
> struct kvm_user_mem_region mem;
> unsigned int flags_offset = offsetof(typeof(mem), flags));
> unsigned long size;
> u32 flags;
>
> kvm_sanity_check_user_mem_region_alias();
>
> memset(&mem, 0, sizeof(mem));
>
> r = -EFAULT;
> if (get_user(flags, (u32 __user *)(argp + flags_offset)))
> goto out;
>
> But this can actually be punted until KVM_MEM_PRIVATE is fully supported. As of
> this patch, KVM doesn't read the extended size, so I believe the diff for this
> patch can simply be:
Looks good to me, Thanks.
Chao
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index da263c370d00..5194beb7b52f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4640,6 +4640,10 @@ static long kvm_vm_ioctl(struct file *filp,
> sizeof(kvm_userspace_mem)))
> goto out;
>
> + r = -EINVAL;
> + if (mem.flags & KVM_MEM_PRIVATE)
> + goto out;
> +
> r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem);
> break;
> }
Powered by blists - more mailing lists