Message-Id: <20210527125134.2116404-1-qperret@google.com>
Date:   Thu, 27 May 2021 12:51:27 +0000
From:   Quentin Perret <qperret@...gle.com>
To:     maz@...nel.org, will@...nel.org, james.morse@....com,
        alexandru.elisei@....com, catalin.marinas@....com,
        suzuki.poulose@....com
Cc:     linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        kernel-team@...roid.com, linux-kernel@...r.kernel.org
Subject: [PATCH 0/7] KVM: arm64: Reduce hyp_vmemmap overhead

Hi all,

When running in nVHE protected mode, the hypervisor manages its own
vmemmap and uses it to store page metadata, e.g. refcounts. This series
shrinks the size of struct hyp_page from 32 bytes to 4 bytes without
loss of functionality, hence reducing the cost of the hyp vmemmap from
8MB/GB to 1MB/GB with 4K pages.
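
For reference, here is a rough before/after sketch of what those numbers
imply. The field types here are illustrative assumptions, not necessarily
the exact ones the patches end up using:

  #include <linux/list.h>               /* struct list_head */

  struct hyp_pool;                      /* opaque for this sketch */

  /* Before the series: 32 bytes per page on arm64. */
  struct hyp_page_before {
          unsigned int refcount;
          unsigned int order;
          struct hyp_pool *pool;        /* dropped by patches 04-05 */
          struct list_head node;        /* dropped by patches 01-03 */
  };

  /* After the series: 4 bytes per page. */
  struct hyp_page_after {
          unsigned short refcount;      /* shrunk by patch 07 */
          unsigned short order;         /* shrunk by patch 06 */
  };

  /*
   * With 4K pages, 1GB of memory is 262144 pages:
   *   262144 * 32 bytes = 8MB/GB of hyp vmemmap before,
   *   262144 *  4 bytes = 1MB/GB after.
   */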

The series has two immediate benefits:
  - the memory overhead of the nVHE protected mode is reduced;
  - it refactors the host stage-2 memory pools in a way that enables
    better re-use of pages for mapping MMIO ranges, allowing more MMIO
    mappings (currently limited to 1GB of IPA space) most of the time.

But more importantly, the series reduces the hyp vmemmap overhead enough
that we might consider covering _all_ of memory with it at EL2 in the
future. This would significantly simplify the dynamic admission of
memory into the EL2 allocator, which will be required when, for
instance, the hypervisor allocates guest stage-2 page-tables. It would
also allow the hypervisor to refcount pages it doesn't 'own', which
would be useful to track shared pages and the like.

The series is split as follows:
  - patches 01-03 move the list_head of each page from struct hyp_page
    to the page itself -- pages are attached to the free lists only when
    they are free, so their memory is unused by definition (see the
    sketch after this list);
  - patches 04-05 remove the hyp_pool pointer from struct hyp_page as
    that information can be inferred from the context;
  - patches 06-07 reduce the size of the remaining members of struct
    hyp_page, which are currently oversized for the needs of the
    hypervisor.
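
To illustrate the idea behind patches 01-03: a free page's memory is by
definition unused, so the free-list node can live at the start of the
page itself instead of in struct hyp_page. The helper names below
(hyp_page_to_virt(), pool->free_area[], hyp_attach_page()) are
assumptions for the sketch, not necessarily what the patches use:

  /* Sketch only: reuse a free page's own memory as its list node. */
  static struct list_head *hyp_page_to_node(struct hyp_page *p)
  {
          return hyp_page_to_virt(p);
  }

  static void hyp_attach_page(struct hyp_pool *pool, struct hyp_page *p)
  {
          /* Only valid because @p is free: nothing else uses its memory. */
          list_add_tail(hyp_page_to_node(p), &pool->free_area[p->order]);
  }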

On a last note, I believe we could actually make hyp_page fit in 16 bits
when using 4K pages: limiting MAX_ORDER to 7 would require only 3 bits
for the order, and 13 bits should be enough for the refcount for the
existing use-cases. I decided not to implement this because we probably
want to keep some room to grow in hyp_page (e.g. to add flags), it might
cause issues when making refcounts atomic, and 16 bits are not enough
with 64K pages so we'd have to deal with that case separately -- but it
_is_ a possibility.
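
For illustration only, such a 16-bit packing (again, not part of this
series) could look something like the below, with the order in 3 bits
(MAX_ORDER = 7) and the refcount in the remaining 13 bits (max 8191):

  #include <linux/types.h>

  struct hyp_page_packed {
          u16 order    : 3;       /* 0..7 */
          u16 refcount : 13;      /* 0..8191 */
  };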

Thanks!
Quentin

Quentin Perret (7):
  KVM: arm64: Move hyp_pool locking out of refcount helpers
  KVM: arm64: Use refcount at hyp to check page availability
  KVM: arm64: Remove list_head from hyp_page
  KVM: arm64: Unify MMIO and mem host stage-2 pools
  KVM: arm64: Remove hyp_pool pointer from struct hyp_page
  KVM: arm64: Use less bits for hyp_page order
  KVM: arm64: Use less bits for hyp_page refcount

 arch/arm64/kvm/hyp/include/nvhe/gfp.h         | 33 ++-----
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  7 +-
 arch/arm64/kvm/hyp/include/nvhe/mm.h          | 13 +--
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 59 +++++++------
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          | 87 ++++++++++++-------
 arch/arm64/kvm/hyp/nvhe/setup.c               | 30 ++++---
 arch/arm64/kvm/hyp/reserved_mem.c             |  3 +-
 8 files changed, 123 insertions(+), 111 deletions(-)

-- 
2.31.1.818.g46aad6cb9e-goog
