Message-ID: <20241118123948.4796-1-kalyazin@amazon.com>
Date: Mon, 18 Nov 2024 12:39:42 +0000
From: Nikita Kalyazin <kalyazin@...zon.com>
To: <pbonzini@...hat.com>, <seanjc@...gle.com>, <corbet@....net>,
	<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>, <hpa@...or.com>, <rostedt@...dmis.org>,
	<mhiramat@...nel.org>, <mathieu.desnoyers@...icios.com>,
	<kvm@...r.kernel.org>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-trace-kernel@...r.kernel.org>
CC: <jthoughton@...gle.com>, <david@...hat.com>, <peterx@...hat.com>,
	<oleg@...hat.com>, <vkuznets@...hat.com>, <gshan@...hat.com>,
	<graf@...zon.de>, <jgowans@...zon.com>, <roypat@...zon.co.uk>,
	<derekmn@...zon.com>, <nsaenz@...zon.es>, <xmarcalx@...zon.com>,
	<kalyazin@...zon.com>
Subject: [RFC PATCH 0/6] KVM: x86: async PF user

Async PF [1] allows other processes to run on a vCPU while the host
handles a stage-2 fault caused by a process on that vCPU. With
VM-exit-based stage-2 fault handling [2], async PF functionality is lost
because KVM does not run the vCPU while a fault is being handled, so no
other process can execute on that vCPU. This patch series extends
VM-exit-based stage-2 fault handling with async PF support by letting
userspace handle faults instead of the kernel, hence the "async PF user"
name.

I circulated the idea with Paolo, Sean, David H, and James H at LPC,
and the only concern I heard was about injecting the "page not present"
event via #PF exception in the CoCo case, where it may not work. In my
implementation, I reused the existing code for doing that, so the async
PF user implementation is on par with the present async PF
implementation in this regard, and support for the CoCo case can be
added separately.

Please note that this series is applied on top of the VM-exit-based
stage-2 fault handling RFC [2].

Implementation

The following workflow is implemented:
 - A process in the guest causes a stage-2 fault.
 - KVM checks whether the fault can be handled asynchronously. If it
   can, KVM prepares VM exit info that has the newly added "async PF"
   flag set and carries an async PF token value corresponding to the
   fault.
 - Userspace reads the VM exit info and resumes the vCPU immediately.
   Meanwhile it processes the fault.
 - When the fault is resolved, userspace calls a new async ioctl using
   the token to notify KVM.
 - KVM communicates to the guest that the process can be resumed.
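
To make the userspace side of the workflow above more concrete, here is a
minimal toy sketch of the bookkeeping a VMM might do. It is not code from
this series: every name in it (struct apf_user_exit, struct vmm,
vmm_handle_fault_exit(), vmm_complete_fault(), MAX_PENDING) is
hypothetical, and the layout of the exit info is an assumption. The point
where the real notification ioctl would be issued is only marked with a
comment.

```c
/*
 * Toy userspace model of the async PF user workflow. All names here are
 * hypothetical and are NOT part of the KVM UAPI proposed in this series.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* What the VMM is assumed to read from the VM exit info. */
struct apf_user_exit {
	bool async;     /* "async PF" flag raised by KVM */
	uint32_t token; /* identifies the fault */
	uint64_t gpa;   /* faulting guest physical address */
};

struct pending_fault {
	bool in_use;
	uint32_t token;
	uint64_t gpa;
};

struct vmm {
	struct pending_fault pending[MAX_PENDING];
	size_t nr_pending;
};

/*
 * Handle a fault exit. Returns true if the fault was queued and the vCPU
 * can be resumed immediately; false if the fault has to be resolved
 * before re-entering the guest (flag not raised, or no free slot).
 */
bool vmm_handle_fault_exit(struct vmm *vmm, const struct apf_user_exit *exit)
{
	size_t i;

	if (!exit->async)
		return false;

	for (i = 0; i < MAX_PENDING; i++) {
		if (!vmm->pending[i].in_use) {
			vmm->pending[i].in_use = true;
			vmm->pending[i].token = exit->token;
			vmm->pending[i].gpa = exit->gpa;
			vmm->nr_pending++;
			return true;
		}
	}
	return false;
}

/*
 * Called once the fault identified by @token has been resolved. Returns
 * true if the token was pending, in which case the caller would notify
 * KVM using the new ioctl (not modelled here).
 */
bool vmm_complete_fault(struct vmm *vmm, uint32_t token)
{
	size_t i;

	for (i = 0; i < MAX_PENDING; i++) {
		if (vmm->pending[i].in_use && vmm->pending[i].token == token) {
			vmm->pending[i].in_use = false;
			vmm->nr_pending--;
			/* The real VMM would issue the notification ioctl here. */
			return true;
		}
	}
	return false;
}
```

In this sketch a fault that cannot be queued simply falls back to being
resolved before the vCPU is resumed, which corresponds to plain
VM-exit-based fault handling without async PF.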

Notes:
 - No changes to the x86 async PF PV interface are required.
 - The series does not introduce new dependencies on x86 compared to the
   existing async PF.

Testing

Inspired by [3], I built a Firecracker-based setup, where Firecracker
implemented the VM-exit-based fault handling. I observed that a workload
consisting of CPU-bound and memory-bound threads running concurrently
executed faster with async PF user enabled: with 10 ms-long fault
processing, it was 26% faster.

It is difficult to provide an objective performance comparison between
async PF kernel and async PF user, because async PF user can only work
with VM-exit-based fault handling, which has its own performance
characteristics compared to in-kernel fault handling or UserfaultFD.

The patch series is built on top of the VM-exit-based stage-2 fault
handling RFC [2].

Patch 1 updates documentation to reflect the changes in [2].
Patches 2-6 add the implementation of async PF user.

Questions:
 - Are there any general concerns about the approach?
 - Can we leave the CoCo use case aside for now, or do we need to
   support it straight away?
 - What is the desired level of coupling between async PF and async PF
   user? For now, I kept the coupling to the bare minimum (only the
   PV-related data structure is shared between the two).

[1] https://kvm-forum.qemu.org/2021/sdei_apf_for_arm64_gavin.pdf
[2] https://lore.kernel.org/kvm/CADrL8HUHRMwUPhr7jLLBgD9YLFAnVHc=N-C=8er-x6GUtV97pQ@mail.gmail.com/T/
[3] https://lore.kernel.org/all/20200508032919.52147-1-gshan@redhat.com/

Nikita

Nikita Kalyazin (6):
  Documentation: KVM: add userfault KVM exit flag
  Documentation: KVM: add async pf user doc
  KVM: x86: add async ioctl support
  KVM: trace events: add type argument to async pf
  KVM: x86: async_pf_user: add infrastructure
  KVM: x86: async_pf_user: hook to fault handling and add ioctl

 Documentation/virt/kvm/api.rst  |  35 ++++++
 arch/x86/include/asm/kvm_host.h |  12 +-
 arch/x86/kvm/Kconfig            |   7 ++
 arch/x86/kvm/lapic.c            |   2 +
 arch/x86/kvm/mmu/mmu.c          |  68 ++++++++++-
 arch/x86/kvm/x86.c              | 101 +++++++++++++++-
 arch/x86/kvm/x86.h              |   2 +
 include/linux/kvm_host.h        |  30 +++++
 include/linux/kvm_types.h       |   1 +
 include/trace/events/kvm.h      |  50 +++++---
 include/uapi/linux/kvm.h        |  12 +-
 virt/kvm/Kconfig                |   3 +
 virt/kvm/Makefile.kvm           |   1 +
 virt/kvm/async_pf.c             |   2 +-
 virt/kvm/async_pf_user.c        | 197 ++++++++++++++++++++++++++++++++
 virt/kvm/async_pf_user.h        |  24 ++++
 virt/kvm/kvm_main.c             |  14 +++
 17 files changed, 535 insertions(+), 26 deletions(-)
 create mode 100644 virt/kvm/async_pf_user.c
 create mode 100644 virt/kvm/async_pf_user.h


base-commit: 15f01813426bf9672e2b24a5bac7b861c25de53b
-- 
2.40.1

