Message-ID: <a4a38f76-d012-4ff4-a2a3-40af9a9a7052@redhat.com>
Date: Thu, 25 Apr 2024 07:52:52 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Matthew Wilcox <willy@...radead.org>, Vlastimil Babka <vbabka@...e.cz>
Cc: seanjc@...gle.com, michael.roth@....com, isaku.yamahata@...el.com,
Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [PATCH 04/11] filemap: add FGP_CREAT_ONLY
On 4/4/24 20:50, Paolo Bonzini wrote:
> KVM would like to add an ioctl to encrypt and install a page into private
> memory (i.e. into a guest_memfd), in preparation for launching an
> encrypted guest.
>
> This API should be used only once per page (unless there are failures),
> so we want to rule out the possibility of operating on a page that is
> already in the guest_memfd's filemap. Overwriting the page is almost
> certainly a sign of a bug, so we might as well forbid it.
>
> Therefore, introduce a new flag for __filemap_get_folio (to be passed
> together with FGP_CREAT) that allows *adding* a new page to the filemap
> but not returning an existing one.
>
> An alternative possibility would be to force KVM users to initialize
> the whole filemap in one go, but that is complicated by the fact that
> the filemap includes pages of different kinds, including some that are
> per-vCPU rather than per-VM. The result would be closer to a system
> call that multiplexes multiple ioctls than to something cleaner like
> readv/writev.
>
> Races between callers that pass FGP_CREAT_ONLY are uninteresting to
> the filemap code: one of the racers wins and one fails with EEXIST,
> similar to calling open(2) with O_CREAT|O_EXCL. It doesn't matter to
> filemap.c if the missing synchronization is in the kernel or in userspace,
> and in fact it could even be intentional. (In the case of KVM it turns
> out that a mutex is taken around these calls for unrelated reasons,
> so there can be no races.)
>
> Cc: Matthew Wilcox <willy@...radead.org>
> Cc: Yosry Ahmed <yosryahmed@...gle.com>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
Matthew, are your objections still valid or could I have your ack?
Thanks,
Paolo
> ---
> include/linux/pagemap.h | 2 ++
> mm/filemap.c | 4 ++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index f879c1d54da7..a8c0685e8c08 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -587,6 +587,7 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
> * * %FGP_CREAT - If no folio is present then a new folio is allocated,
> * added to the page cache and the VM's LRU list. The folio is
> * returned locked.
> + * * %FGP_CREAT_ONLY - Fail if a folio is present
> * * %FGP_FOR_MMAP - The caller wants to do its own locking dance if the
> * folio is already in cache. If the folio was allocated, unlock it
> * before returning so the caller can do the same dance.
> @@ -607,6 +608,7 @@ typedef unsigned int __bitwise fgf_t;
> #define FGP_NOWAIT ((__force fgf_t)0x00000020)
> #define FGP_FOR_MMAP ((__force fgf_t)0x00000040)
> #define FGP_STABLE ((__force fgf_t)0x00000080)
> +#define FGP_CREAT_ONLY ((__force fgf_t)0x00000100)
> #define FGF_GET_ORDER(fgf) (((__force unsigned)fgf) >> 26) /* top 6 bits */
>
> #define FGP_WRITEBEGIN (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7437b2bd75c1..e7440e189ebd 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1863,6 +1863,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
> folio = NULL;
> if (!folio)
> goto no_page;
> + if (fgp_flags & FGP_CREAT_ONLY) {
> + folio_put(folio);
> + return ERR_PTR(-EEXIST);
> + }
>
> if (fgp_flags & FGP_LOCK) {
> if (fgp_flags & FGP_NOWAIT) {
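
The O_CREAT|O_EXCL analogy from the commit message can be tried out from userspace. What follows is only a minimal sketch, assuming a writable /tmp; the helper names create_exclusive() and demo_create_only() and the file path are hypothetical and are not part of the patch. It shows the behaviour the new flag is modelled on: the first creator wins and any later attempt on the same name fails with EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create "path" only if it does not exist yet.
 * Returns 0 on success or a negative errno value on failure, roughly
 * mirroring how the patched __filemap_get_folio() reports -EEXIST.
 */
static int create_exclusive(const char *path)
{
	int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

	if (fd < 0)
		return -errno;
	close(fd);
	return 0;
}

/*
 * Hypothetical demo: two attempts on the same name. The first one
 * creates the file, the second one must fail with EEXIST.
 */
static void demo_create_only(const char *path)
{
	unlink(path);	/* start from a clean state; errors are ignored */
	assert(create_exclusive(path) == 0);
	assert(create_exclusive(path) == -EEXIST);
	unlink(path);	/* clean up */
}
```

Calling demo_create_only("/tmp/fgp_creat_only_demo.tmp") runs both attempts in sequence. With two concurrent callers the same rule applies: exactly one open() succeeds, which is the property the commit message relies on for FGP_CREAT_ONLY.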