linux-kernel - Re: [PATCH RFC gmem v1 4/8] KVM: x86: Add gmem hook for invalidating memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZfC6gnqVhZQJnB_3@google.com>
Date: Tue, 12 Mar 2024 13:26:42 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Michael Roth <michael.roth@....com>
Cc: Steven Price <steven.price@....com>, kvm@...r.kernel.org, 
	Suzuki K Poulose <suzuki.poulose@....com>, "tabba@...gle.com" <tabba@...gle.com>, linux-coco@...ts.linux.dev, 
	linux-mm@...ck.org, linux-crypto@...r.kernel.org, x86@...nel.org, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	pbonzini@...hat.com, isaku.yamahata@...el.com, ackerleytng@...gle.com, 
	vbabka@...e.cz, ashish.kalra@....com, nikunj.dadhania@....com, 
	jroedel@...e.de, pankaj.gupta@....com
Subject: Re: [PATCH RFC gmem v1 4/8] KVM: x86: Add gmem hook for invalidating memory

On Mon, Mar 11, 2024, Michael Roth wrote:
> On Fri, Feb 09, 2024 at 07:13:13AM -0800, Sean Christopherson wrote:
> > On Fri, Feb 09, 2024, Steven Price wrote:
> > > >> One option that I've considered is to implement a seperate CCA ioctl to
> > > >> notify KVM whether the memory should be mapped protected.
> > > > 
> > > > That's what KVM_SET_MEMORY_ATTRIBUTES+KVM_MEMORY_ATTRIBUTE_PRIVATE is for, no?
> > > 
> > > Sorry, I really didn't explain that well. Yes effectively this is the
> > > attribute flag, but there's corner cases for destruction of the VM. My
> > > thought was that if the VMM wanted to tear down part of the protected
> > > range (without making it shared) then a separate ioctl would be needed
> > > to notify KVM of the unmap.
> > 
> > No new uAPI should be needed, because the only scenario time a benign VMM should
> > do this is if the guest also knows the memory is being removed, in which case
> > PUNCH_HOLE will suffice.
> > 
> > > >> This 'solves' the problem nicely except for the case where the VMM
> > > >> deliberately punches holes in memory which the guest is using.
> > > > 
> > > > I don't see what problem there is to solve in this case.  PUNCH_HOLE is destructive,
> > > > so don't do that.
> > > 
> > > A well behaving VMM wouldn't PUNCH_HOLE when the guest is using it, but
> > > my concern here is a VMM which is trying to break the host. In this case
> > > either the PUNCH_HOLE needs to fail, or we actually need to recover the
> > > memory from the guest (effectively killing the guest in the process).
> > 
> > The latter.  IIRC, we talked about this exact case somewhere in the hour-long
> > rambling discussion on guest_memfd at PUCK[1].  And we've definitely discussed
> > this multiple times on-list, though I don't know that there is a single thread
> > that captures the entire plan.
> > 
> > The TL;DR is that gmem will invoke an arch hook for every "struct kvm_gmem"
> > instance that's attached to a given guest_memfd inode when a page is being fully
> > removed, i.e. when a page is being freed back to the normal memory pool.  Something
> > like this proposed SNP patch[2].
> > 
> > Mike, do have WIP patches you can share?
> 
> Sorry, I missed this query earlier. I'm a bit confused though, I thought
> the kvm_arch_gmem_invalidate() hook provided in this patch was what we
> ended up agreeing on during the PUCK call in question.

Heh, I trust your memory of things far more than I trust mine.  I'm just proving
Cunningham's Law.  :-)

> There was an open question about what to do if a use-case came along
> where we needed to pass additional parameters to
> kvm_arch_gmem_invalidate() other than just the start/end PFN range for
> the pages being freed, but we'd determined that SNP and TDX did not
> currently need this, so I didn't have any changes planned in this
> regard.
> 
> If we now have such a need, what we had proposed was to modify
> __filemap_remove_folio()/page_cache_delete() to defer setting
> folio->mapping to NULL so that we could still access it in
> kvm_gmem_free_folio() so that we can still access mapping->i_private_list
> to get the list of gmem/KVM instances and pass them on via
> kvm_arch_gmem_invalidate().

Yeah, this is what I was remembering.  I obviously forgot that we didn't have a
need to iterate over all bindings at this time.

> So that's doable, but it's not clear from this discussion that that's
> needed.

Same here.  And even if it is needed, it's not your problem to solve.  The above
blurb about needing to preserve folio->mapping being free_folio() is sufficient
to get the ARM code moving in the right direction.

Thanks!

> If the idea to block/kill the guest if VMM tries to hole-punch,
> and ARM CCA already has plans to wire up the shared/private flags in
> kvm_unmap_gfn_range(), wouldn't that have all the information needed to
> kill that guest? At that point, kvm_gmem_free_folio() can handle
> additional per-page cleanup (with additional gmem/KVM info plumbed in
> if necessary).