lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 28 Oct 2022 15:04:27 +0800
From:   Xiaoyao Li <xiaoyao.li@...el.com>
To:     Chao Peng <chao.p.peng@...ux.intel.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-api@...r.kernel.org, linux-doc@...r.kernel.org,
        qemu-devel@...gnu.org
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
        Hugh Dickins <hughd@...gle.com>,
        Jeff Layton <jlayton@...nel.org>,
        "J . Bruce Fields" <bfields@...ldses.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>, Mike Rapoport <rppt@...nel.org>,
        Steven Price <steven.price@....com>,
        "Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
        Vlastimil Babka <vbabka@...e.cz>,
        Vishal Annapurve <vannapurve@...gle.com>,
        Yu Zhang <yu.c.zhang@...ux.intel.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        luto@...nel.org, jun.nakajima@...el.com, dave.hansen@...el.com,
        ak@...ux.intel.com, david@...hat.com, aarcange@...hat.com,
        ddutile@...hat.com, dhildenb@...hat.com,
        Quentin Perret <qperret@...gle.com>, tabba@...gle.com,
        Michael Roth <michael.roth@....com>, mhocko@...e.com,
        Muchun Song <songmuchun@...edance.com>, wei.w.wang@...el.com
Subject: Re: [PATCH v9 2/8] KVM: Extend the memslot to support fd-based
 private memory

On 10/25/2022 11:13 PM, Chao Peng wrote:
> In memory encryption usage, guest memory may be encrypted with special
> key and can be accessed only by the guest itself. We call such memory
> private memory. It's valueless and sometimes can cause problem to allow
> userspace to access guest private memory. This new KVM memslot extension
> allows guest private memory being provided though a restrictedmem
                                                  ^

typo

> backed file descriptor(fd) and userspace is restricted to access the
> bookmarked memory in the fd.
> 
> This new extension, indicated by the new flag KVM_MEM_PRIVATE, adds two
> additional KVM memslot fields restricted_fd/restricted_offset to allow
> userspace to instruct KVM to provide guest memory through restricted_fd.
> 'guest_phys_addr' is mapped at the restricted_offset of restricted_fd
> and the size is 'memory_size'.
> 
> The extended memslot can still have the userspace_addr(hva). When use, a
> single memslot can maintain both private memory through restricted_fd
> and shared memory through userspace_addr. Whether the private or shared
> part is visible to guest is maintained by other KVM code.
> 
> A restrictedmem_notifier field is also added to the memslot structure to
> allow the restricted_fd's backing store to notify KVM the memory change,
> KVM then can invalidate its page table entries.
> 
> Together with the change, a new config HAVE_KVM_RESTRICTED_MEM is added
> and right now it is selected on X86_64 only. A KVM_CAP_PRIVATE_MEM is
> also introduced to indicate KVM support for KVM_MEM_PRIVATE.
> 
> To make code maintenance easy, internally we use a binary compatible
> alias struct kvm_user_mem_region to handle both the normal and the
> '_ext' variants.
> 
> Co-developed-by: Yu Zhang <yu.c.zhang@...ux.intel.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@...ux.intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@...ux.intel.com>
> ---
>   Documentation/virt/kvm/api.rst | 48 ++++++++++++++++++++++++++++-----
>   arch/x86/kvm/Kconfig           |  2 ++
>   arch/x86/kvm/x86.c             |  2 +-
>   include/linux/kvm_host.h       | 13 +++++++--
>   include/uapi/linux/kvm.h       | 29 ++++++++++++++++++++
>   virt/kvm/Kconfig               |  3 +++
>   virt/kvm/kvm_main.c            | 49 ++++++++++++++++++++++++++++------
>   7 files changed, 128 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index eee9f857a986..f3fa75649a78 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1319,7 +1319,7 @@ yet and must be cleared on entry.
>   :Capability: KVM_CAP_USER_MEMORY
>   :Architectures: all
>   :Type: vm ioctl
> -:Parameters: struct kvm_userspace_memory_region (in)
> +:Parameters: struct kvm_userspace_memory_region(_ext) (in)
>   :Returns: 0 on success, -1 on error
>   
>   ::
> @@ -1332,9 +1332,18 @@ yet and must be cleared on entry.
>   	__u64 userspace_addr; /* start of the userspace allocated memory */
>     };
>   
> +  struct kvm_userspace_memory_region_ext {
> +	struct kvm_userspace_memory_region region;
> +	__u64 restricted_offset;
> +	__u32 restricted_fd;
> +	__u32 pad1;
> +	__u64 pad2[14];
> +  };
> +
>     /* for kvm_memory_region::flags */
>     #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
>     #define KVM_MEM_READONLY	(1UL << 1)
> +  #define KVM_MEM_PRIVATE		(1UL << 2)
>   
>   This ioctl allows the user to create, modify or delete a guest physical
>   memory slot.  Bits 0-15 of "slot" specify the slot id and this value
> @@ -1365,12 +1374,27 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
>   be identical.  This allows large pages in the guest to be backed by large
>   pages in the host.
>   
> -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> -KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
> -writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
> -use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> -to make a new slot read-only.  In this case, writes to this memory will be
> -posted to userspace as KVM_EXIT_MMIO exits.
> +kvm_userspace_memory_region_ext struct includes all fields of
> +kvm_userspace_memory_region struct, while also adds additional fields for some
> +other features. See below description of flags field for more information.
> +It's recommended to use kvm_userspace_memory_region_ext in new userspace code.
> +
> +The flags field supports following flags:
> +
> +- KVM_MEM_LOG_DIRTY_PAGES to instruct KVM to keep track of writes to memory
> +  within the slot.  For more details, see KVM_GET_DIRTY_LOG ioctl.
> +
> +- KVM_MEM_READONLY, if KVM_CAP_READONLY_MEM allows, to make a new slot
> +  read-only.  In this case, writes to this memory will be posted to userspace as
> +  KVM_EXIT_MMIO exits.
> +
> +- KVM_MEM_PRIVATE, if KVM_CAP_PRIVATE_MEM allows, to indicate a new slot has
> +  private memory backed by a file descriptor(fd) and userspace access to the
> +  fd may be restricted. Userspace should use restricted_fd/restricted_offset in
> +  kvm_userspace_memory_region_ext to instruct KVM to provide private memory
> +  to guest. Userspace should guarantee not to map the same pfn indicated by
> +  restricted_fd/restricted_offset to different gfns with multiple memslots.
> +  Failed to do this may result undefined behavior.
>   
>   When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
>   the memory region are automatically reflected into the guest.  For example, an
> @@ -8215,6 +8239,16 @@ structure.
>   When getting the Modified Change Topology Report value, the attr->addr
>   must point to a byte where the value will be stored or retrieved from.
>   
> +8.36 KVM_CAP_PRIVATE_MEM
> +------------------------
> +
> +:Architectures: x86
> +
> +This capability indicates that private memory is supported and userspace can
> +set KVM_MEM_PRIVATE flag for KVM_SET_USER_MEMORY_REGION ioctl.  See
> +KVM_SET_USER_MEMORY_REGION for details on the usage of KVM_MEM_PRIVATE and
> +kvm_userspace_memory_region_ext fields.
> +
>   9. Known KVM API problems
>   =========================
>   
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 67be7f217e37..8d2bd455c0cd 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -49,6 +49,8 @@ config KVM
>   	select SRCU
>   	select INTERVAL_TREE
>   	select HAVE_KVM_PM_NOTIFIER if PM
> +	select HAVE_KVM_RESTRICTED_MEM if X86_64
> +	select RESTRICTEDMEM if HAVE_KVM_RESTRICTED_MEM
>   	help
>   	  Support hosting fully virtualized guest machines using hardware
>   	  virtualization extensions.  You will need a fairly recent
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4bd5f8a751de..02ad31f46dd7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12425,7 +12425,7 @@ void __user * __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa,
>   	}
>   
>   	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> -		struct kvm_userspace_memory_region m;
> +		struct kvm_user_mem_region m;
>   
>   		m.slot = id | (i << 16);
>   		m.flags = 0;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 32f259fa5801..739a7562a1f3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -44,6 +44,7 @@
>   
>   #include <asm/kvm_host.h>
>   #include <linux/kvm_dirty_ring.h>
> +#include <linux/restrictedmem.h>
>   
>   #ifndef KVM_MAX_VCPU_IDS
>   #define KVM_MAX_VCPU_IDS KVM_MAX_VCPUS
> @@ -575,8 +576,16 @@ struct kvm_memory_slot {
>   	u32 flags;
>   	short id;
>   	u16 as_id;
> +	struct file *restricted_file;
> +	loff_t restricted_offset;
> +	struct restrictedmem_notifier notifier;
>   };
>   
> +static inline bool kvm_slot_can_be_private(const struct kvm_memory_slot *slot)
> +{
> +	return slot && (slot->flags & KVM_MEM_PRIVATE);
> +}
> +

We can introduce this function in patch 6 when it's first used.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ