Message-ID: <ZsX4_1TlIu6WNo7r@kernel.org>
Date: Wed, 21 Aug 2024 17:26:07 +0300
From: Mike Rapoport <rppt@...nel.org>
To: Elliot Berman <quic_eberman@...cinc.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Paolo Bonzini <pbonzini@...hat.com>,
	Sean Christopherson <seanjc@...gle.com>,
	Fuad Tabba <tabba@...gle.com>, David Hildenbrand <david@...hat.com>,
	Patrick Roy <roypat@...zon.co.uk>, qperret@...gle.com,
	Ackerley Tng <ackerleytng@...gle.com>, linux-coco@...ts.linux.dev,
	linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, kvm@...r.kernel.org
Subject: Re: [PATCH RFC 3/4] mm: guest_memfd: Add option to remove guest
 private memory from direct map

On Tue, Aug 20, 2024 at 09:56:10AM -0700, Elliot Berman wrote:
> On Mon, Aug 19, 2024 at 01:09:52PM +0300, Mike Rapoport wrote:
> > On Mon, Aug 05, 2024 at 11:34:49AM -0700, Elliot Berman wrote:
> > > This patch was reworked from Patrick's patch:
> > > https://lore.kernel.org/all/20240709132041.3625501-6-roypat@amazon.co.uk/
> > > 
> > > While guest_memfd is not available to be mapped by userspace, it is
> > > still accessible through the kernel's direct map. This means that in
> > > scenarios where guest-private memory is not hardware protected, it can
> > > be speculatively read and its contents potentially leaked through
> > > hardware side-channels. Removing guest-private memory from the direct
> > > map thus mitigates a large class of speculative execution issues
> > > [1, Table 1].
> > > 
> > > Direct map removal does not reuse the `.prepare` machinery, since
> > > `prepare` can be called multiple times, and it is the responsibility of
> > > the preparation routine not to "prepare" the same folio twice [2].
> > > Instead, explicitly check whether `filemap_grab_folio` allocated a new
> > > folio, and remove the returned folio from the direct map only in that
> > > case.
> > > 
> > > The patch uses release_folio instead of free_folio to reinsert pages
> > > into the direct map because, by the time free_folio is called,
> > > folio->mapping can already be NULL. A call to folio_inode inside
> > > free_folio might then dereference a NULL pointer, leaving no way to
> > > access the inode which stores the flags that indicate whether the page
> > > was removed from the direct map in the first place.
> > > 
> > > [1]: https://download.vusec.net/papers/quarantine_raid23.pdf
> > > 
> > > Cc: Patrick Roy <roypat@...zon.co.uk>
> > > Signed-off-by: Elliot Berman <quic_eberman@...cinc.com>
> > > ---
> > >  include/linux/guest_memfd.h |  8 ++++++
> > >  mm/guest_memfd.c            | 65 ++++++++++++++++++++++++++++++++++++++++++++-
> > >  2 files changed, 72 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/guest_memfd.h b/include/linux/guest_memfd.h
> > > index be56d9d53067..f9e4a27aed67 100644
> > > --- a/include/linux/guest_memfd.h
> > > +++ b/include/linux/guest_memfd.h
> > > @@ -25,6 +25,14 @@ struct guest_memfd_operations {
> > >  	int (*release)(struct inode *inode);
> > >  };
> > >  
> > > +/**
> > > + * @GUEST_MEMFD_FLAG_NO_DIRECT_MAP: When making folios inaccessible by host, also
> > > + *                                  remove them from the kernel's direct map.
> > > + */
> > > +enum {
> > 
> > please name this enum, otherwise kernel-doc won't be happy
> > 
> > > +	GUEST_MEMFD_FLAG_NO_DIRECT_MAP		= BIT(0),
> > > +};
> > > +
> > >  /**
> > >   * @GUEST_MEMFD_GRAB_UPTODATE: Ensure pages are zeroed/up to date.
> > >   *                             If trusted hyp will do it, can omit this flag
> > > diff --git a/mm/guest_memfd.c b/mm/guest_memfd.c
> > > index 580138b0f9d4..e9d8cab72b28 100644
> > > --- a/mm/guest_memfd.c
> > > +++ b/mm/guest_memfd.c
> > > @@ -7,9 +7,55 @@
> > >  #include <linux/falloc.h>
> > >  #include <linux/guest_memfd.h>
> > >  #include <linux/pagemap.h>
> > > +#include <linux/set_memory.h>
> > > +
> > > +static inline int guest_memfd_folio_private(struct folio *folio)
> > > +{
> > > +	unsigned long nr_pages = folio_nr_pages(folio);
> > > +	unsigned long i;
> > > +	int r;
> > > +
> > > +	for (i = 0; i < nr_pages; i++) {
> > > +		struct page *page = folio_page(folio, i);
> > > +
> > > +		r = set_direct_map_invalid_noflush(page);
> > > +		if (r < 0)
> > > +			goto out_remap;
> > > +	}
> > > +
> > > +	folio_set_private(folio);
> > > +	return 0;
> > > +out_remap:
> > > +	for (; i > 0; i--) {
> > > +		struct page *page = folio_page(folio, i - 1);
> > > +
> > > +		BUG_ON(set_direct_map_default_noflush(page));
> > > +	}
> > > +	return r;
> > > +}
> > > +
> > > +static inline void guest_memfd_folio_clear_private(struct folio *folio)
> > > +{
> > > +	unsigned long start = (unsigned long)folio_address(folio);
> > > +	unsigned long nr = folio_nr_pages(folio);
> > > +	unsigned long i;
> > > +
> > > +	if (!folio_test_private(folio))
> > > +		return;
> > > +
> > > +	for (i = 0; i < nr; i++) {
> > > +		struct page *page = folio_page(folio, i);
> > > +
> > > +		BUG_ON(set_direct_map_default_noflush(page));
> > > +	}
> > > +	flush_tlb_kernel_range(start, start + folio_size(folio));
> > 
> > I think the TLB flush should come after removing pages from the direct
> > map rather than after adding them back.
> > 
> 
> Gunyah flushes the TLB when it removes the stage-2 mapping, so we
> skipped the flush on removal as a performance optimization. I remember
> seeing that pKVM does the same (a TLB flush for the stage-2 unmap, and
> the equivalent on x86). Patrick had also done the same in their patches.

Strictly from the API perspective, unmapping the pages from the direct map
would imply removing potentially stale TLB entries.
If all currently anticipated users do it elsewhere, at the very least there
should be a huge bold comment.

And what's the point of a TLB flush after setting the direct map back to
default? There should be no stale TLB entries for the unmapped pages.
 
> Thanks,
> Elliot

-- 
Sincerely yours,
Mike.
