lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB41578D7BFEDE33BD2E8246EFD4E92@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Mon, 20 May 2024 20:32:43 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Catalin Marinas <catalin.marinas@....com>, Suzuki K Poulose
	<suzuki.poulose@....com>
CC: Steven Price <steven.price@....com>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
	Marc Zyngier <maz@...nel.org>, Will Deacon <will@...nel.org>, James Morse
	<james.morse@....com>, Oliver Upton <oliver.upton@...ux.dev>, Zenghui Yu
	<yuzenghui@...wei.com>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Joey Gouly <joey.gouly@....com>, Alexandru
 Elisei <alexandru.elisei@....com>, Christoffer Dall
	<christoffer.dall@....com>, Fuad Tabba <tabba@...gle.com>,
	"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, Ganapatrao
 Kulkarni <gankulkarni@...amperecomputing.com>
Subject: RE: [PATCH v2 09/14] arm64: Enable memory encrypt for Realms

From: Catalin Marinas <catalin.marinas@....com> Sent: Monday, May 20, 2024 9:53 AM
> 
> On Wed, May 15, 2024 at 11:47:02AM +0100, Suzuki K Poulose wrote:
> > On 14/05/2024 19:00, Catalin Marinas wrote:
> > > On Fri, Apr 12, 2024 at 09:42:08AM +0100, Steven Price wrote:
> > > >   static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
> > > > @@ -41,6 +45,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
> > > >   	pte = clear_pte_bit(pte, cdata->clear_mask);
> > > >   	pte = set_pte_bit(pte, cdata->set_mask);
> > > > +	/* TODO: Break before make for PROT_NS_SHARED updates */
> > > >   	__set_pte(ptep, pte);
> > > >   	return 0;
> > >
> > > Oh, this TODO is problematic, not sure we can do it safely. There are
> > > some patches on the list to trap faults from other CPUs if they happen
> > > to access the page when broken but so far we pushed back as complex and
> > > at risk of getting the logic wrong.
> > >
> > >  From an architecture perspective, you are changing the output address
> > > and D8.16.1 requires a break-before-make sequence (FEAT_BBM doesn't
> > > help). So we either come up with a way to do BMM safely (stop_machine()
> > > maybe if it's not too expensive or some way to guarantee no accesses to
> > > this page while being changed) or we get the architecture clarified on
> > > the possible side-effects here ("unpredictable" doesn't help).
> >
> > Thanks, we need to sort this out.
> 
> Thanks for the clarification on RIPAS states and behaviour in one of
> your replies. Thinking about this, since the page is marked as
> RIPAS_EMPTY prior to changing the PTE, the address is going to fault
> anyway as SEA if accessed. So actually breaking the PTE, TLBI, setting
> the new PTE would not add any new behaviour. Of course, this assumes
> that set_memory_decrypted() is never called on memory being currently
> accessed (can we guarantee this?).

While I worked on CoCo VM support on Hyper-V for x86 -- both AMD
SEV-SNP and Intel TDX, I haven't ramped up on the ARM64 CoCo
VM architecture yet.  With that caveat in mind, the assumption is that callers
of set_memory_decrypted() and set_memory_encrypted() ensure that
the target memory isn't currently being accessed.   But there's a big
exception:  load_unaligned_zeropad() can generate accesses that the
caller can't control.  If load_unaligned_zeropad() touches a page that is
in transition between decrypted and encrypted, a SEV-SNP or TDX architectural
fault could occur.  On x86, those fault handlers detect this case, and
fix things up.  The Hyper-V case requires a different approach, and marks
the PTEs as "not present" before initiating a transition between decrypted
and encrypted, and marks the PTEs "present" again after the transition.
This approach causes a reference generated by load_unaligned_zeropad() 
to take the normal page fault route, and use the page-fault-based fixup for
load_unaligned_zeropad(). See commit 0f34d11234868 for the Hyper-V case.

> 
> So, we need to make sure that there is no access to the linear map
> between set_memory_range_shared() and the actual pte update with
> __change_memory_common() in set_memory_decrypted(). At a quick look,
> this seems to be the case (ignoring memory scrubbers, though dummy ones
> just accessing memory are not safe anyway and unlikely to run in a
> guest).
> 
> (I did come across the hv_uio_probe() which, if I read correctly, it
> ends up calling set_memory_decrypted() with a vmalloc() address; let's
> pretend this code doesn't exist ;))

While the Hyper-V UIO driver is perhaps a bit of an outlier, the Hyper-V
netvsc driver also does set_memory_decrypted() on 16 Mbyte vmalloc()
allocations, and there's not really a viable way to avoid this. The
SEV-SNP and TDX code handles this case.   Support for this case will
probably also be needed for CoCo guests on Hyper-V on ARM64.

Michael

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ