Message-ID: <6b79711c-ee0f-47aa-b42f-51f13ac0bd5c@amazon.com>
Date: Fri, 14 Nov 2025 15:23:22 +0000
From: Nikita Kalyazin <kalyazin@...zon.com>
To: "Kalyazin, Nikita" <kalyazin@...zon.co.uk>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "shuah@...nel.org" <shuah@...nel.org>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "david@...nel.org"
	<david@...nel.org>, "jthoughton@...gle.com" <jthoughton@...gle.com>,
	"ackerleytng@...gle.com" <ackerleytng@...gle.com>, "vannapurve@...gle.com"
	<vannapurve@...gle.com>, "jackmanb@...gle.com" <jackmanb@...gle.com>,
	"patrick.roy@...ux.dev" <patrick.roy@...ux.dev>, "Thomson, Jack"
	<jackabt@...zon.co.uk>, "Itazuri, Takahiro" <itazur@...zon.co.uk>,
	"Manwaring, Derek" <derekmn@...zon.com>, "Cali, Marco"
	<xmarcalx@...zon.co.uk>
Subject: Re: [PATCH v7 0/2] KVM: guest_memfd: use write for population



On 14/11/2025 15:18, Kalyazin, Nikita wrote:
> On systems that support shared guest memory, write() is useful, for
> example, for population of the initial image.  Even though the same can
> also be achieved via userspace mapping and memcpying from userspace,
> write() provides a more performant option because it does not need to
> set up user page tables and it does not cause a page fault for every page
> like memcpy would.  Note that memcpy cannot be accelerated via
> MADV_POPULATE_WRITE as it is not supported by guest_memfd and relies on
> GUP.
> 
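For illustration, the two population paths compared above can be sketched
from userspace roughly as follows.  This is a minimal sketch, not code from
the series: populate_via_memcpy() and populate_via_write() are hypothetical
helper names, and the fd is assumed to be either a guest_memfd created with
mmap and write support (GUEST_MEMFD_FLAG_WRITE) or, for trying the helpers
out, any regular file already sized with ftruncate().  Using pwrite() rather
than lseek() + write() is also an assumption.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Path 1: map the file into userspace and memcpy into the mapping. */
int populate_via_memcpy(int fd, const void *src, size_t len)
{
	void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	if (dst == MAP_FAILED)
		return -errno;

	/* Each destination page is faulted in on first touch. */
	memcpy(dst, src, len);

	if (munmap(dst, len) < 0)
		return -errno;
	return 0;
}

/* Path 2: hand the buffer to the kernel, retrying on short writes. */
int populate_via_write(int fd, const void *src, size_t len)
{
	const char *p = src;
	size_t done = 0;

	while (done < len) {
		ssize_t n = pwrite(fd, p + done, len - done, (off_t)done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (n == 0)
			return -EIO;	/* no progress, give up */
		done += (size_t)n;
	}
	return 0;
}
```

Both helpers return 0 on success and a negative errno value on failure, so
a caller can time them back to back on the same buffer.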
> Populating 512MiB of guest_memfd on an x86 machine:
>   - via memcpy: 436 ms
>   - via write:  202 ms (-54%)
> 
> Only PAGE_ALIGNED offset and len are allowed.  Even though non-aligned
> writes are technically possible, when in-place conversion support is
> implemented [1], the restriction makes handling of mixed shared/private
> huge pages simpler.  write() will only be allowed to populate shared
> pages.
> 
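A caller can check the alignment rule before issuing the write.  The helper
below is a hypothetical userspace sketch (gmem_write_range_ok() is not part
of the series); it only mirrors the "page-aligned offset and len" condition
described above and makes no claim about which errno the kernel returns
for a rejected range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Return true if both offset and len are multiples of page_size.
 * page_size would normally come from sysconf(_SC_PAGESIZE).
 */
bool gmem_write_range_ok(off_t offset, size_t len, long page_size)
{
	if (page_size <= 0 || offset < 0)
		return false;

	return (offset % page_size == 0) && (len % (size_t)page_size == 0);
}
```

With 4 KiB pages, offset 8192 with len 8192 passes, while offset 1 or
len 100 does not.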
> When direct map removal is implemented [2]
>   - write() will not be allowed to access pages that have already
>     been removed from direct map
>   - on completion, write() will remove the populated pages from
>     direct map
> 
> While it is technically possible to implement read() syscall on systems
> with shared guest memory, it is not supported as there is currently no
> use case for it.
> 
> [1]
> https://lore.kernel.org/kvm/cover.1760731772.git.ackerleytng@google.com
> [2]
> https://lore.kernel.org/kvm/20250924151101.2225820-1-patrick.roy@campus.lmu.de

I failed to include the changelog and links to previous versions:

v7:
  - Sean: add GUEST_MEMFD_FLAG_WRITE and documentation for it
  - Ackerley: only allow PAGE_ALIGNED offset and len
  - Sean/Ackerley: formatting fixes

v6:
  - https://lore.kernel.org/kvm/20251020161352.69257-1-kalyazin@amazon.com
  - Make write support conditional on mmap support instead of relying on
    the up-to-date flag to decide whether writing to a page is allowed
  - James: Remove dependencies on folio_test_large
  - James: Remove page alignment restriction
  - James: Formatting fixes

v5:
  - https://lore.kernel.org/kvm/20250902111951.58315-1-kalyazin@amazon.com
  - Replace the call to the unexported filemap_remove_folio with
    zeroing the bytes that could not be copied
  - Fix checkpatch findings

v4:
  - https://lore.kernel.org/kvm/20250828153049.3922-1-kalyazin@amazon.com
  - Switch from implementing the write callback to write_iter
  - Remove conditional compilation

v3:
  - https://lore.kernel.org/kvm/20250303130838.28812-1-kalyazin@amazon.com
  - David/Mike D: Only compile support for the write syscall if
    CONFIG_KVM_GMEM_SHARED_MEM (now gone) is enabled.

v2:
  - https://lore.kernel.org/kvm/20241129123929.64790-1-kalyazin@amazon.com
  - Switch from an ioctl to the write syscall to implement population

v1:
  - https://lore.kernel.org/kvm/20241024095429.54052-1-kalyazin@amazon.com

> 
> Nikita Kalyazin (2):
>    KVM: guest_memfd: add generic population via write
>    KVM: selftests: update guest_memfd write tests
> 
>   Documentation/virt/kvm/api.rst                |  2 +
>   include/linux/kvm_host.h                      |  2 +-
>   include/uapi/linux/kvm.h                      |  1 +
>   .../testing/selftests/kvm/guest_memfd_test.c  | 58 +++++++++++++++++--
>   virt/kvm/guest_memfd.c                        | 52 +++++++++++++++++
>   5 files changed, 108 insertions(+), 7 deletions(-)
> 
> 
> base-commit: 8a4821412cf2c1429fffa07c012dd150f2edf78c
> --
> 2.50.1
> 

