Message-ID: <20250402160721.97596-1-kalyazin@amazon.com>
Date: Wed, 2 Apr 2025 16:07:16 +0000
From: Nikita Kalyazin <kalyazin@...zon.com>
To: <akpm@...ux-foundation.org>, <pbonzini@...hat.com>, <shuah@...nel.org>
CC: <kvm@...r.kernel.org>, <linux-kselftest@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<lorenzo.stoakes@...cle.com>, <david@...hat.com>, <ryan.roberts@....com>,
	<quic_eberman@...cinc.com>, <jthoughton@...gle.com>, <peterx@...hat.com>,
	<graf@...zon.de>, <jgowans@...zon.com>, <roypat@...zon.co.uk>,
	<derekmn@...zon.com>, <nsaenz@...zon.es>, <xmarcalx@...zon.com>,
	<kalyazin@...zon.com>
Subject: [PATCH v2 0/5] KVM: guest_memfd: support for uffd minor

This series is built on top of Fuad's v7 "mapping guest_memfd backed
memory at the host" [1].

With James's KVM userfault [2], it is possible to handle stage-2 faults
in guest_memfd in userspace.  However, KVM itself also triggers faults
in guest_memfd in some cases, for example: PV interfaces like kvmclock,
PV EOI and page table walking code when fetching the MMIO instruction on
x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
that KVM would be accessing those pages via userspace page tables.  In
order for such faults to be handled in userspace, guest_memfd needs to
support userfaultfd.

Changes since v1 [4]:
 - James, Peter: implement a full minor trap instead of a hybrid
   missing/minor trap
 - James, Peter: to avoid shmem- and guest_memfd-specific code in the
   UFFDIO_CONTINUE implementation, make it generic by calling
   vm_ops->fault()

While generalising the UFFDIO_CONTINUE implementation helped avoid
guest_memfd-specific code in mm/userfaultfd.c, userfaultfd still needs
access to KVM code to be able to verify the VMA type when handling
UFFDIO_REGISTER_MODE_MINOR, so for now I used an approach similar to
Fuad's [5].
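
For context, a minimal userspace sketch of the two userfaultfd calls
discussed above: registering a range for minor faults and resolving a
fault with UFFDIO_CONTINUE.  It uses only the existing UAPI from
<linux/userfaultfd.h>.  The helper names (page_align_down,
uffd_register_minor, uffd_continue) are hypothetical, and whether a
guest_memfd mapping accepts the registration is an assumption that
depends on this series being applied.

```c
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: round addr down to a page_size boundary
 * (page_size is assumed to be a power of two). */
uint64_t page_align_down(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}

/* Hypothetical helper: register [addr, addr + len) for minor faults
 * on an already opened userfaultfd. Returns the ioctl() result. */
int uffd_register_minor(int uffd, void *addr, uint64_t len)
{
	struct uffdio_register reg;

	memset(&reg, 0, sizeof(reg));
	reg.range.start = (uint64_t)(uintptr_t)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MINOR;

	return ioctl(uffd, UFFDIO_REGISTER, &reg);
}

/* Hypothetical helper: resolve a minor fault by asking the kernel to
 * map the page that is already present in the page cache. */
int uffd_continue(int uffd, uint64_t fault_addr, uint64_t page_size)
{
	struct uffdio_continue cont;

	memset(&cont, 0, sizeof(cont));
	cont.range.start = page_align_down(fault_addr, page_size);
	cont.range.len = page_size;

	return ioctl(uffd, UFFDIO_CONTINUE, &cont);
}
```

A fault-handling thread would typically populate the page through a
second mapping of the same file before calling uffd_continue() with
the address reported in the fault message.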

In v1, Peter mentioned the possibility of avoiding taking a folio
lock [6].  I did not implement that, but according to my testing, the
performance of shmem minor fault handling stayed the same after the
migration to calling vm_ops->fault() (tested on x86).

Before:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:	10.979277020s
Per-vcpu demand paging rate:	23876.253375 pgs/sec/vcpu
Overall demand paging rate:	23876.253375 pgs/sec

After:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:	10.978893504s
Per-vcpu demand paging rate:	23877.087423 pgs/sec/vcpu
Overall demand paging rate:	23877.087423 pgs/sec
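
To put the two runs side by side, the relative change can be computed
from the figures printed above.  This is a small sketch; the function
name relative_change_pct is hypothetical and not part of the selftest.

```c
/* Hypothetical helper: relative change from 'before' to 'after',
 * expressed in percent. */
double relative_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Applied to the overall demand paging rates (23876.253375 and
23877.087423 pgs/sec), this gives a change of roughly 0.0035%, and the
total guest execution time differs by about the same fraction in the
opposite direction.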

Nikita

[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
[4] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/
[5] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/#Z2e.:..:20250318161823.4005529-3-tabba::40google.com:1mm:swap.c
[6] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/#m8695dc24d2cc633a6a486a8990e3f7d50d4efb79

Nikita Kalyazin (5):
  mm: userfaultfd: generic continue for non hugetlbfs
  KVM: guest_memfd: add kvm_gmem_vma_is_gmem
  mm: userfaultfd: allow to register continue for guest_memfd
  KVM: guest_memfd: add support for userfaultfd minor
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm_types.h                      |  3 +
 include/linux/userfaultfd_k.h                 | 13 ++-
 mm/hugetlb.c                                  |  2 +-
 mm/shmem.c                                    |  3 +-
 mm/userfaultfd.c                              | 25 +++--
 .../testing/selftests/kvm/guest_memfd_test.c  | 94 +++++++++++++++++++
 virt/kvm/guest_memfd.c                        | 15 +++
 virt/kvm/kvm_mm.h                             |  1 +
 8 files changed, 146 insertions(+), 10 deletions(-)


base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0
-- 
2.47.1

