Date:   Mon, 1 Feb 2021 18:16:08 +0000
From:   Will Deacon <will@...nel.org>
To:     Quentin Perret <qperret@...gle.com>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Marc Zyngier <maz@...nel.org>,
        James Morse <james.morse@....com>,
        Julien Thierry <julien.thierry.kdev@...il.com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Rob Herring <robh+dt@...nel.org>,
        Frank Rowand <frowand.list@...il.com>,
        devicetree@...r.kernel.org, android-kvm@...gle.com,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com,
        kvmarm@...ts.cs.columbia.edu, linux-arm-kernel@...ts.infradead.org,
        Fuad Tabba <tabba@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        David Brazdil <dbrazdil@...gle.com>
Subject: Re: [RFC PATCH v2 06/26] KVM: arm64: Factor memory allocation out of
 pgtable.c

On Fri, Jan 08, 2021 at 12:15:04PM +0000, Quentin Perret wrote:
> In preparation for enabling the creation of page-tables at EL2, factor
> all memory allocation out of the page-table code, hence making it
> re-usable with any compatible memory allocator.
> 
> No functional changes intended.
> 
> Signed-off-by: Quentin Perret <qperret@...gle.com>
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 32 +++++++++-
>  arch/arm64/kvm/hyp/pgtable.c         | 90 +++++++++++++++++-----------
>  arch/arm64/kvm/mmu.c                 | 70 +++++++++++++++++++++-
>  3 files changed, 154 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 52ab38db04c7..45acc9dc6c45 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -13,17 +13,41 @@
>  
>  typedef u64 kvm_pte_t;
>  
> +/**
> + * struct kvm_pgtable_mm_ops - Memory management callbacks.
> + * @zalloc_page:	Allocate a zeroed memory page.

Please describe the 'arg' parameter.
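
Something along these lines would do (wording is only a suggestion, based on
my reading of how the callers use it later in the patch):

 * @zalloc_page:	Allocate a zeroed memory page. @arg is an opaque
 *			pointer passed through from the caller of the
 *			page-table API (e.g. a memory cache).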

> + * @zalloc_pages_exact:	Allocate an exact number of zeroed memory pages.

I think this comment could be expanded somewhat to make it clear that (a)
the 'size' parameter is in bytes rather than pages, (b) the rounding
behaviour applied if 'size' is not page-aligned and (c) that the resulting
allocation is physically contiguous.
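
Perhaps something like (untested wording):

 * @zalloc_pages_exact:	Allocate an exact number of zeroed, physically
 *			contiguous memory pages. @size is in bytes and is
 *			rounded up to the next page boundary.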

> + * @free_pages_exact:	Free an exact number of memory pages.
> + * @get_page:		Increment the refcount on a page.
> + * @put_page:		Decrement the refcount on a page.
> + * @page_count:		Returns the refcount of a page.
> + * @phys_to_virt:	Convert a physical address into a virtual address.
> + * @virt_to_phys:	Convert a virtual address into a physical address.

I think it would be good to be explicit about the nature of the virtual
address here. We're dealing with virtual addresses that are mapped in the
current context rather than e.g. guest virtual addresses.
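
For instance (again, only a suggestion):

 * @phys_to_virt:	Convert a physical address into a virtual address
 *			mapped in the current context.
 * @virt_to_phys:	Convert a virtual address mapped in the current
 *			context into a physical address.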

> + */
> +struct kvm_pgtable_mm_ops {
> +	void*		(*zalloc_page)(void *arg);
> +	void*		(*zalloc_pages_exact)(size_t size);
> +	void		(*free_pages_exact)(void *addr, size_t size);
> +	void		(*get_page)(void *addr);
> +	void		(*put_page)(void *addr);
> +	int		(*page_count)(void *addr);
> +	void*		(*phys_to_virt)(phys_addr_t phys);
> +	phys_addr_t	(*virt_to_phys)(void *addr);
> +};

[...]

> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 1f41173e6149..278e163beda4 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -88,6 +88,48 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>  	return !pfn_valid(pfn);
>  }
>  
> +static void *stage2_memcache_alloc_page(void *arg)
> +{
> +	struct kvm_mmu_memory_cache *mc = arg;
> +	kvm_pte_t *ptep = NULL;
> +
> +	/* Allocated with GFP_KERNEL_ACCOUNT, so no need to zero */

I couldn't spot where GFP_KERNEL_ACCOUNT implies __GFP_ZERO. Please can you
elaborate?
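
For reference, include/linux/gfp.h defines it as:

  #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)

so unless I'm missing something, there's no implicit __GFP_ZERO in there.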

> +	if (mc && mc->nobjs)
> +		ptep = mc->objects[--mc->nobjs];
> +
> +	return ptep;
> +}

Why can't we use kvm_mmu_memory_cache_alloc() directly instead of opening up
the memory_cache?
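
Untested, but something like the below is what I had in mind (keeping the
NULL/empty check here and letting the generic helper do the rest):

	static void *stage2_memcache_alloc_page(void *arg)
	{
		struct kvm_mmu_memory_cache *mc = arg;

		if (!mc || !mc->nobjs)
			return NULL;

		return kvm_mmu_memory_cache_alloc(mc);
	}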

> +static void *kvm_host_zalloc_pages_exact(size_t size)
> +{
> +	return alloc_pages_exact(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);

Hmm, so now we're passing __GFP_ZERO? ;)

> +static void kvm_host_get_page(void *addr)
> +{
> +	get_page(virt_to_page(addr));
> +}
> +
> +static void kvm_host_put_page(void *addr)
> +{
> +	put_page(virt_to_page(addr));
> +}
> +
> +static int kvm_host_page_count(void *addr)
> +{
> +	return page_count(virt_to_page(addr));
> +}
> +
> +static phys_addr_t kvm_host_pa(void *addr)
> +{
> +	return __pa(addr);
> +}
> +
> +static void *kvm_host_va(phys_addr_t phys)
> +{
> +	return __va(phys);
> +}
> +
>  /*
>   * Unmapping vs dcache management:
>   *
> @@ -351,6 +393,17 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
>  	return 0;
>  }
>  
> +static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = {
> +	.zalloc_page		= stage2_memcache_alloc_page,
> +	.zalloc_pages_exact	= kvm_host_zalloc_pages_exact,
> +	.free_pages_exact	= free_pages_exact,
> +	.get_page		= kvm_host_get_page,
> +	.put_page		= kvm_host_put_page,
> +	.page_count		= kvm_host_page_count,
> +	.phys_to_virt		= kvm_host_va,
> +	.virt_to_phys		= kvm_host_pa,
> +};

Idle thought, but I wonder whether it would be better to have these
implementations as the default and make the mm_ops structure parameter
to kvm_pgtable_stage2_init() optional? I guess you don't gain an awful
lot though, so feel free to ignore me.
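
Roughly (the signature is only illustrative, and kvm_default_mm_ops is a
made-up name for wherever the host implementations would end up living):

	int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
				    struct kvm_pgtable_mm_ops *mm_ops)
	{
		if (!mm_ops)
			mm_ops = &kvm_default_mm_ops;

		/* ... rest as in this patch ... */
	}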

Will
