linux-kernel - Re: [RFC PATCH v2 06/26] KVM: arm64: Factor memory allocation out of pgtable.c

Message-ID: <YBhJVPz124QRAYFf@google.com>
Date:   Mon, 1 Feb 2021 18:32:52 +0000
From:   Quentin Perret <qperret@...gle.com>
To:     Will Deacon <will@...nel.org>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Marc Zyngier <maz@...nel.org>,
        James Morse <james.morse@....com>,
        Julien Thierry <julien.thierry.kdev@...il.com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Rob Herring <robh+dt@...nel.org>,
        Frank Rowand <frowand.list@...il.com>,
        devicetree@...r.kernel.org, android-kvm@...gle.com,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com,
        kvmarm@...ts.cs.columbia.edu, linux-arm-kernel@...ts.infradead.org,
        Fuad Tabba <tabba@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        David Brazdil <dbrazdil@...gle.com>
Subject: Re: [RFC PATCH v2 06/26] KVM: arm64: Factor memory allocation out of
 pgtable.c

On Monday 01 Feb 2021 at 18:16:08 (+0000), Will Deacon wrote:
> On Fri, Jan 08, 2021 at 12:15:04PM +0000, Quentin Perret wrote:
> > In preparation for enabling the creation of page-tables at EL2, factor
> > all memory allocation out of the page-table code, hence making it
> > re-usable with any compatible memory allocator.
> > 
> > No functional changes intended.
> > 
> > Signed-off-by: Quentin Perret <qperret@...gle.com>
> > ---
> >  arch/arm64/include/asm/kvm_pgtable.h | 32 +++++++++-
> >  arch/arm64/kvm/hyp/pgtable.c         | 90 +++++++++++++++++-----------
> >  arch/arm64/kvm/mmu.c                 | 70 +++++++++++++++++++++-
> >  3 files changed, 154 insertions(+), 38 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > index 52ab38db04c7..45acc9dc6c45 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -13,17 +13,41 @@
> >  
> >  typedef u64 kvm_pte_t;
> >  
> > +/**
> > + * struct kvm_pgtable_mm_ops - Memory management callbacks.
> > + * @zalloc_page:	Allocate a zeroed memory page.
> 
> Please describe the 'arg' parameter.
> 
> > + * @zalloc_pages_exact:	Allocate an exact number of zeroed memory pages.
> 
> I think this comment could be expanded somewhat to make it clear (a) that
> the 'size' parameter is in bytes rather than pages, (b) what rounding
> behaviour is applied if 'size' is not page-aligned and (c) that the
> resulting allocation is physically contiguous.
> 
> > + * @free_pages_exact:	Free an exact number of memory pages.
> > + * @get_page:		Increment the refcount on a page.
> > + * @put_page:		Decrement the refcount on a page.
> > + * @page_count:		Returns the refcount of a page.
> > + * @phys_to_virt:	Convert a physical address into a virtual address.
> > + * @virt_to_phys:	Convert a virtual address into a physical address.
> 
> I think it would be good to be explicit about the nature of the virtual
> address here. We're dealing with virtual addresses that are mapped in the
> current context rather than e.g. guest virtual addresses.

Ack to all the above.
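
As a rough sketch of how the expanded kernel-doc could read once the points
above are folded in (the wording is a guess, not the final text; the
round-up behaviour is assumed to match alloc_pages_exact(), and the
phys_addr_t typedef is a userspace stand-in so the fragment compiles on its
own):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel type (assumption, sketch only). */
typedef uint64_t phys_addr_t;

/**
 * struct kvm_pgtable_mm_ops - Memory management callbacks.
 * @zalloc_page:	Allocate a single zeroed memory page. The @arg
 *			parameter is an opaque pointer handed through to
 *			the allocator (e.g. a memory cache to draw from).
 * @zalloc_pages_exact:	Allocate zeroed memory covering @size bytes (not
 *			pages). A @size that is not page-aligned is assumed
 *			to be rounded up to the next page boundary. The
 *			resulting allocation is physically contiguous.
 * @free_pages_exact:	Free memory from zalloc_pages_exact(); @size is
 *			again in bytes.
 * @get_page:		Increment the refcount on a page.
 * @put_page:		Decrement the refcount on a page.
 * @page_count:		Returns the refcount of a page.
 * @phys_to_virt:	Convert a physical address into a virtual address
 *			that is mapped in the current context.
 * @virt_to_phys:	Convert a virtual address mapped in the current
 *			context into a physical address.
 */
struct kvm_pgtable_mm_ops {
	void*		(*zalloc_page)(void *arg);
	void*		(*zalloc_pages_exact)(size_t size);
	void		(*free_pages_exact)(void *addr, size_t size);
	void		(*get_page)(void *addr);
	void		(*put_page)(void *addr);
	int		(*page_count)(void *addr);
	void*		(*phys_to_virt)(phys_addr_t phys);
	phys_addr_t	(*virt_to_phys)(void *addr);
};
```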

> > + */
> > +struct kvm_pgtable_mm_ops {
> > +	void*		(*zalloc_page)(void *arg);
> > +	void*		(*zalloc_pages_exact)(size_t size);
> > +	void		(*free_pages_exact)(void *addr, size_t size);
> > +	void		(*get_page)(void *addr);
> > +	void		(*put_page)(void *addr);
> > +	int		(*page_count)(void *addr);
> > +	void*		(*phys_to_virt)(phys_addr_t phys);
> > +	phys_addr_t	(*virt_to_phys)(void *addr);
> > +};
> 
> [...]
> 
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 1f41173e6149..278e163beda4 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -88,6 +88,48 @@ static bool kvm_is_device_pfn(unsigned long pfn)
> >  	return !pfn_valid(pfn);
> >  }
> >  
> > +static void *stage2_memcache_alloc_page(void *arg)
> > +{
> > +	struct kvm_mmu_memory_cache *mc = arg;
> > +	kvm_pte_t *ptep = NULL;
> > +
> > +	/* Allocated with GFP_KERNEL_ACCOUNT, so no need to zero */
> 
> I couldn't spot where GFP_KERNEL_ACCOUNT implies __GFP_ZERO.

I'm not surprised, it doesn't. The comment is clearly broken; I'll fix it with
s/GFP_KERNEL_ACCOUNT/__GFP_ZERO/

> Please can you elaborate?
> 
> > +	if (mc && mc->nobjs)
> > +		ptep = mc->objects[--mc->nobjs];
> > +
> > +	return ptep;
> > +}
> 
> Why can't we use kvm_mmu_memory_cache_alloc() directly instead of opening up
> the memory_cache?

I think we can -- that function didn't exist when I first wrote this,
but no good reason not to use it now.
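
A minimal sketch of what that could look like, assuming the callback simply
forwards to kvm_mmu_memory_cache_alloc(). The struct and the helper below
are trimmed userspace stand-ins so the fragment compiles outside the kernel
tree; MOCK_CACHE_CAPACITY and the calloc() fallback are assumptions of the
mock and only approximate what the kernel helper does on an empty cache:

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_CACHE_CAPACITY 8	/* hypothetical, for this mock only */

/* Userspace stand-in for struct kvm_mmu_memory_cache (fields trimmed). */
struct kvm_mmu_memory_cache {
	int nobjs;
	void *objects[MOCK_CACHE_CAPACITY];
};

/*
 * Userspace stand-in for the kernel helper of the same name: pop the most
 * recently added object. The empty-cache fallback here is a plain zeroed
 * allocation, which is an approximation made for the sketch.
 */
static void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
{
	if (!mc->nobjs)
		return calloc(1, 4096);

	return mc->objects[--mc->nobjs];
}

/*
 * The callback no longer opens up the cache itself. Note that, unlike the
 * open-coded version, it does not tolerate a NULL @arg.
 */
static void *stage2_memcache_alloc_page(void *arg)
{
	struct kvm_mmu_memory_cache *mc = arg;

	/* Allocated with __GFP_ZERO, so no need to zero */
	return kvm_mmu_memory_cache_alloc(mc);
}
```

One behavioural difference worth keeping in mind: the open-coded version
returns NULL when the cache is empty, so callers relying on that would need
a second look.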

> > +static void *kvm_host_zalloc_pages_exact(size_t size)
> > +{
> > +	return alloc_pages_exact(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> 
> Hmm, so now we're passing __GFP_ZERO? ;)

:-)

> > +static void kvm_host_get_page(void *addr)
> > +{
> > +	get_page(virt_to_page(addr));
> > +}
> > +
> > +static void kvm_host_put_page(void *addr)
> > +{
> > +	put_page(virt_to_page(addr));
> > +}
> > +
> > +static int kvm_host_page_count(void *addr)
> > +{
> > +	return page_count(virt_to_page(addr));
> > +}
> > +
> > +static phys_addr_t kvm_host_pa(void *addr)
> > +{
> > +	return __pa(addr);
> > +}
> > +
> > +static void *kvm_host_va(phys_addr_t phys)
> > +{
> > +	return __va(phys);
> > +}
> > +
> >  /*
> >   * Unmapping vs dcache management:
> >   *
> > @@ -351,6 +393,17 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
> >  	return 0;
> >  }
> >  
> > +static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = {
> > +	.zalloc_page		= stage2_memcache_alloc_page,
> > +	.zalloc_pages_exact	= kvm_host_zalloc_pages_exact,
> > +	.free_pages_exact	= free_pages_exact,
> > +	.get_page		= kvm_host_get_page,
> > +	.put_page		= kvm_host_put_page,
> > +	.page_count		= kvm_host_page_count,
> > +	.phys_to_virt		= kvm_host_va,
> > +	.virt_to_phys		= kvm_host_pa,
> > +};
> 
> Idle thought, but I wonder whether it would be better to have these
> implementations as the default and make the mm_ops structure parameter
> to kvm_pgtable_stage2_init() optional? I guess you don't gain an awful
> lot though, so feel free to ignore me.

No strong opinion really, but I suppose I could do something as simple
as having static inline wrappers which provide kvm_s2_mm_ops to the
pgtable API for me. I'll probably want to make sure these are not
defined when compiling EL2 code, though, to avoid confusion.
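
Roughly along these lines, as a sketch only: the types and the init function
are cut-down stand-ins with a simplified parameter list, the wrapper name
kvm_host_pgtable_stage2_init() is hypothetical, and guarding on
__KVM_NVHE_HYPERVISOR__ is just one possible way of hiding the wrapper from
EL2 code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Cut-down stand-ins so the sketch compiles outside the kernel tree. */
struct kvm_pgtable_mm_ops {
	void *(*zalloc_page)(void *arg);
};

struct kvm_pgtable {
	struct kvm_pgtable_mm_ops *mm_ops;
};

/* Stand-in for the generic init, which takes the mm_ops explicitly. */
static int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt,
				   struct kvm_pgtable_mm_ops *mm_ops)
{
	if (!mm_ops)
		return -EINVAL;

	pgt->mm_ops = mm_ops;
	return 0;
}

#ifndef __KVM_NVHE_HYPERVISOR__
/* Host-side callbacks; the initialisers are omitted in this sketch. */
static struct kvm_pgtable_mm_ops kvm_s2_mm_ops;

/* Hypothetical host-only wrapper that supplies kvm_s2_mm_ops for the caller. */
static inline int kvm_host_pgtable_stage2_init(struct kvm_pgtable *pgt)
{
	return kvm_pgtable_stage2_init(pgt, &kvm_s2_mm_ops);
}
#endif /* !__KVM_NVHE_HYPERVISOR__ */
```

In the real tree kvm_s2_mm_ops is static to mmu.c, so a header-level wrapper
would need it exported or the wrapper kept local to that file.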

Or maybe you had something else in mind?

Cheers,
Quentin