Date: Tue, 21 May 2024 15:58:21 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Catalin Marinas <catalin.marinas@....com>
CC: Suzuki K Poulose <suzuki.poulose@....com>, Steven Price
	<steven.price@....com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>, Marc Zyngier
	<maz@...nel.org>, Will Deacon <will@...nel.org>, James Morse
	<james.morse@....com>, Oliver Upton <oliver.upton@...ux.dev>, Zenghui Yu
	<yuzenghui@...wei.com>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Joey Gouly <joey.gouly@....com>, Alexandru
 Elisei <alexandru.elisei@....com>, Christoffer Dall
	<christoffer.dall@....com>, Fuad Tabba <tabba@...gle.com>,
	"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, Ganapatrao
 Kulkarni <gankulkarni@...amperecomputing.com>
Subject: RE: [PATCH v2 09/14] arm64: Enable memory encrypt for Realms

From: Catalin Marinas <catalin.marinas@....com> Sent: Tuesday, May 21, 2024 3:14 AM
> 
> On Mon, May 20, 2024 at 08:32:43PM +0000, Michael Kelley wrote:
> > From: Catalin Marinas <catalin.marinas@....com> Sent: Monday, May 20, 2024 9:53 AM
> > > > > On Fri, Apr 12, 2024 at 09:42:08AM +0100, Steven Price wrote:
> > > > > >   static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
> > > > > > @@ -41,6 +45,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
> > > > > >   	pte = clear_pte_bit(pte, cdata->clear_mask);
> > > > > >   	pte = set_pte_bit(pte, cdata->set_mask);
> > > > > > +	/* TODO: Break before make for PROT_NS_SHARED updates */
> > > > > >   	__set_pte(ptep, pte);
> > > > > >   	return 0;
> [...]
> > > Thanks for the clarification on RIPAS states and behaviour in one of
> > > your replies. Thinking about this, since the page is marked as
> > > RIPAS_EMPTY prior to changing the PTE, the address is going to fault
> > > anyway as SEA if accessed. So actually breaking the PTE, TLBI, setting
> > > the new PTE would not add any new behaviour. Of course, this assumes
> > > that set_memory_decrypted() is never called on memory being currently
> > > accessed (can we guarantee this?).
> >
> > While I worked on CoCo VM support on Hyper-V for x86 -- both AMD
> > SEV-SNP and Intel TDX, I haven't ramped up on the ARM64 CoCo
> > VM architecture yet.  With that caveat in mind, the assumption is that callers
> > of set_memory_decrypted() and set_memory_encrypted() ensure that
> > the target memory isn't currently being accessed.   But there's a big
> > exception:  load_unaligned_zeropad() can generate accesses that the
> > caller can't control.  If load_unaligned_zeropad() touches a page that is
> > in transition between decrypted and encrypted, a SEV-SNP or TDX architectural
> > fault could occur.  On x86, those fault handlers detect this case, and
> > fix things up.  The Hyper-V case requires a different approach, and marks
> > the PTEs as "not present" before initiating a transition between decrypted
> > and encrypted, and marks the PTEs "present" again after the transition.
> 
> Thanks. The load_unaligned_zeropad() case is a good point. I thought
> we'd get away with this on arm64 since accessing such decrypted page
> would trigger a synchronous exception but looking at the code, the
> do_sea() path never calls fixup_exception(), so we just kill the whole
> kernel.
> 
> > This approach causes a reference generated by load_unaligned_zeropad()
> > to take the normal page fault route, and use the page-fault-based fixup for
> > load_unaligned_zeropad(). See commit 0f34d11234868 for the Hyper-V case.
> 
> I think for arm64 set_memory_decrypted() (and encrypted) would have to
> first make the PTE invalid, TLBI, set the RIPAS_EMPTY state, set the new
> PTE. Any page fault due to invalid PTE would be handled by the exception
> fixup in load_unaligned_zeropad(). This way we wouldn't get any
> synchronous external abort (SEA) in standard uses. Not sure we need to
> do anything hyper-v specific as in the commit above.

Sounds good to me. I tried to do the same for all the x86 cases (instead of
just handling the Hyper-V paravisor), since that would completely decouple
TDX/SEV-SNP from load_unaligned_zeropad(). It worked for TDX. But
SEV-SNP does the PVALIDATE instruction during a decrypted<->encrypted
transition, and PVALIDATE inconveniently requires the virtual address as
input. It only uses the vaddr to translate to the paddr, but with the vaddr
PTE "not present", PVALIDATE fails. Sigh. This problem will probably come
back again when/if Coconut or any other paravisor redirects #VC/#VE to
the paravisor. But I digress ....
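
For illustration only, the ordering proposed in the quoted text above (make
the PTE invalid, TLBI, set RIPAS_EMPTY, set the new PTE) can be modelled in
plain user-space C. This is a rough sketch, not kernel code: every name in
it (model_page, model_access(), model_decrypt_bbm(), and so on) is
hypothetical, the "TLB" is a single cached copy of the PTE, and the model
assumes that an access through a valid private translation to a RIPAS_EMPTY
page is the only thing that raises an SEA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy user-space model; every name here is hypothetical, not a kernel API. */
#define MODEL_PTE_VALID  (1ULL << 0)
#define MODEL_PTE_SHARED (1ULL << 1) /* stands in for PROT_NS_SHARED */

enum model_access {
	MODEL_ACCESS_OK,
	MODEL_ACCESS_PAGE_FAULT, /* can be fixed up via the exception table */
	MODEL_ACCESS_SEA,        /* assumed fatal: no fixup on the SEA path */
};

struct model_page {
	uint64_t pte;     /* page table entry in memory */
	uint64_t tlb;     /* cached copy of the translation */
	bool tlb_valid;
	bool ripas_empty; /* assumed: private access to this page raises SEA */
};

/* A private, valid, TLB-cached page whose RIPAS is not EMPTY. */
static struct model_page model_page_init(void)
{
	struct model_page p = {
		.pte = MODEL_PTE_VALID,
		.tlb = MODEL_PTE_VALID,
		.tlb_valid = true,
		.ripas_empty = false,
	};
	return p;
}

/* Model a stray access, e.g. one made by load_unaligned_zeropad(). */
static enum model_access model_access(struct model_page *p)
{
	assert(p);
	if (!p->tlb_valid) {
		if (!(p->pte & MODEL_PTE_VALID))
			return MODEL_ACCESS_PAGE_FAULT;
		p->tlb = p->pte; /* page table walk refills the TLB */
		p->tlb_valid = true;
	}
	if (!(p->tlb & MODEL_PTE_SHARED) && p->ripas_empty)
		return MODEL_ACCESS_SEA;
	return MODEL_ACCESS_OK;
}

/* Break-before-make ordering; trace[i] is a stray access after step i+1. */
static void model_decrypt_bbm(struct model_page *p, enum model_access trace[4])
{
	assert(p && trace);
	p->pte &= ~MODEL_PTE_VALID;                    /* 1: invalidate the PTE */
	trace[0] = model_access(p);
	p->tlb_valid = false;                          /* 2: TLBI */
	trace[1] = model_access(p);
	p->ripas_empty = true;                         /* 3: set RIPAS_EMPTY */
	trace[2] = model_access(p);
	p->pte |= MODEL_PTE_SHARED | MODEL_PTE_VALID;  /* 4: set the new PTE */
	trace[3] = model_access(p);
}

/* In-place update with no break step, for comparison. */
static void model_decrypt_in_place(struct model_page *p,
				   enum model_access trace[2])
{
	assert(p && trace);
	p->ripas_empty = true;        /* RIPAS changes under a live mapping */
	trace[0] = model_access(p);
	p->pte |= MODEL_PTE_SHARED;   /* rewrite the PTE, then flush */
	p->tlb_valid = false;
	trace[1] = model_access(p);
}
```

Starting from model_page_init(), none of the four accesses recorded by
model_decrypt_bbm() is an SEA: once the TLBI has happened a stray access
takes an ordinary page fault until the new PTE is written. In
model_decrypt_in_place() the first recorded access is an SEA, which is the
case the fixup in load_unaligned_zeropad() cannot handle.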

> 
> > > (I did come across the hv_uio_probe() which, if I read correctly, it
> > > ends up calling set_memory_decrypted() with a vmalloc() address; let's
> > > pretend this code doesn't exist ;))
> >
> > While the Hyper-V UIO driver is perhaps a bit of an outlier, the Hyper-V
> > netvsc driver also does set_memory_decrypted() on 16 Mbyte vmalloc()
> > allocations, and there's not really a viable way to avoid this. The
> > SEV-SNP and TDX code handles this case.   Support for this case will
> > probably also be needed for CoCo guests on Hyper-V on ARM64.
> 
> Ah, I was hoping we can ignore it. So the arm64 set_memory_*() code will
> have to detect and change both the vmalloc map and the linear map.

Yep.
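
As a rough illustration of what "change both maps" means, here is a
user-space toy model. It is a sketch under stated assumptions, not the
arm64 or x86 implementation: model_mem, model_set_shared_linear() and
model_set_shared_vmalloc() are hypothetical names, and each "PTE" is reduced
to a single shared/private flag. The point it shows is that a vmalloc()
range is backed by scattered frames, so each page has to be looked up
individually and its linear-map alias updated along with the vmalloc alias.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; all names are hypothetical and not kernel interfaces. */
#define MODEL_NR_FRAMES 8

struct model_mem {
	bool linear_shared[MODEL_NR_FRAMES];  /* linear-map state per frame */
	size_t vmalloc_pfn[MODEL_NR_FRAMES];  /* vmalloc page -> backing frame */
	bool vmalloc_shared[MODEL_NR_FRAMES]; /* vmalloc-map state per page */
	size_t vmalloc_pages;                 /* number of vmalloc pages in use */
};

/* Linear-map range: the frames are contiguous, so one pass is enough. */
static int model_set_shared_linear(struct model_mem *m, size_t first_pfn,
				   size_t n, bool shared)
{
	assert(m);
	if (first_pfn > MODEL_NR_FRAMES || n > MODEL_NR_FRAMES - first_pfn)
		return -1;
	for (size_t i = 0; i < n; i++)
		m->linear_shared[first_pfn + i] = shared;
	return 0;
}

/*
 * vmalloc range: frames may be scattered, so update page by page and keep
 * the linear-map alias of each backing frame in the same state.
 */
static int model_set_shared_vmalloc(struct model_mem *m, size_t first_page,
				    size_t n, bool shared)
{
	assert(m);
	if (first_page > m->vmalloc_pages || n > m->vmalloc_pages - first_page)
		return -1;
	for (size_t i = 0; i < n; i++) {
		size_t pfn = m->vmalloc_pfn[first_page + i];

		if (pfn >= MODEL_NR_FRAMES)
			return -1;
		m->vmalloc_shared[first_page + i] = shared;
		m->linear_shared[pfn] = shared;
	}
	return 0;
}
```

With three vmalloc pages backed by frames 5, 1 and 6, a call to
model_set_shared_vmalloc() over all three pages flips exactly those three
linear-map entries and leaves the other frames private.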

> Currently this patchset assumes the latter only.
> 
> Such buffers may end up in user space as well but I think at the
> set_memory_decrypted() call there aren't any such mappings and
> subsequent remap_pfn_range() etc. would handle the permissions properly
> through the vma->vm_page_prot attributes (assuming that someone set
> those pgprot attributes).

Yes, I'm pretty sure that's what we've seen on the x86 side.

Michael
