lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 15 May 2024 11:47:02 +0100
From: Suzuki K Poulose <suzuki.poulose@....com>
To: Catalin Marinas <catalin.marinas@....com>,
 Steven Price <steven.price@....com>
Cc: kvm@...r.kernel.org, kvmarm@...ts.linux.dev, Marc Zyngier
 <maz@...nel.org>, Will Deacon <will@...nel.org>,
 James Morse <james.morse@....com>, Oliver Upton <oliver.upton@...ux.dev>,
 Zenghui Yu <yuzenghui@...wei.com>, linux-arm-kernel@...ts.infradead.org,
 linux-kernel@...r.kernel.org, Joey Gouly <joey.gouly@....com>,
 Alexandru Elisei <alexandru.elisei@....com>,
 Christoffer Dall <christoffer.dall@....com>, Fuad Tabba <tabba@...gle.com>,
 linux-coco@...ts.linux.dev,
 Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>
Subject: Re: [PATCH v2 09/14] arm64: Enable memory encrypt for Realms

On 14/05/2024 19:00, Catalin Marinas wrote:
> On Fri, Apr 12, 2024 at 09:42:08AM +0100, Steven Price wrote:
>>   static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
>> @@ -41,6 +45,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
>>   	pte = clear_pte_bit(pte, cdata->clear_mask);
>>   	pte = set_pte_bit(pte, cdata->set_mask);
>>   
>> +	/* TODO: Break before make for PROT_NS_SHARED updates */
>>   	__set_pte(ptep, pte);
>>   	return 0;
> 
> Oh, this TODO is problematic, not sure we can do it safely. There are
> some patches on the list to trap faults from other CPUs if they happen
> to access the page when broken but so far we pushed back as complex and
> at risk of getting the logic wrong.
> 
>  From an architecture perspective, you are changing the output address
> and D8.16.1 requires a break-before-make sequence (FEAT_BBM doesn't
> help). So we either come up with a way to do BMM safely (stop_machine()
> maybe if it's not too expensive or some way to guarantee no accesses to
> this page while being changed) or we get the architecture clarified on
> the possible side-effects here ("unpredictable" doesn't help).

Thanks, we need to sort this out.


> 
>>   }
>> @@ -192,6 +197,43 @@ int set_direct_map_default_noflush(struct page *page)
>>   				   PAGE_SIZE, change_page_range, &data);
>>   }
>>   
>> +static int __set_memory_encrypted(unsigned long addr,
>> +				  int numpages,
>> +				  bool encrypt)
>> +{
>> +	unsigned long set_prot = 0, clear_prot = 0;
>> +	phys_addr_t start, end;
>> +
>> +	if (!is_realm_world())
>> +		return 0;
>> +
>> +	WARN_ON(!__is_lm_address(addr));
> 
> Just return from this function if it's not a linear map address. No
> point in corrupting other areas since __virt_to_phys() will get it
> wrong.
> 
>> +	start = __virt_to_phys(addr);
>> +	end = start + numpages * PAGE_SIZE;
>> +
>> +	if (encrypt) {
>> +		clear_prot = PROT_NS_SHARED;
>> +		set_memory_range_protected(start, end);
>> +	} else {
>> +		set_prot = PROT_NS_SHARED;
>> +		set_memory_range_shared(start, end);
>> +	}
>> +
>> +	return __change_memory_common(addr, PAGE_SIZE * numpages,
>> +				      __pgprot(set_prot),
>> +				      __pgprot(clear_prot));
>> +}
> 
> Can someone summarise what the point of this protection bit is? The IPA
> memory is marked as protected/unprotected already via the RSI call and
> presumably the RMM disables/permits sharing with a non-secure hypervisor
> accordingly irrespective of which alias the realm guest has the linear
> mapping mapped to. What does it do with the top bit of the IPA? Is it
> that the RMM will prevent (via Stage 2) access if the IPA does not match
> the requested protection? IOW, it unmaps one or the other at Stage 2?

The Realm's IPA space is split in half. The lower half is "protected"
and all pages backing the "protected" IPA is in the Realm world and
thus cannot be shared with the hypervisor. The upper half IPA is
"unprotected" (backed by Non-secure PAS pages) and can be accessed
by the Host/hyp.

The RSI call (RSI_IPA_STATE_SET) doesn't make an IPA unprotected. It
simply "invalidates" a (protected) IPA to "EMPTY" implying the Realm 
doesn't intend to use the "ipa" as RAM anymore and any access to it from
the Realm would trigger an SEA into the Realm. The RSI call triggers an 
exit to the host with the information and is a hint to the hypervisor to 
reclaim the page backing the IPA.

Now, given we need dynamic "sharing" of pages (instead of a dedicated
set of shared pages), "aliasing" of an IPA gives us shared pages.
i.e., If OS wants to share a page "x" (protected IPA) with the host,
we mark that as EMPTY via RSI call and then access the "x" with top-bit
set (aliasing the IPA x). This fault allows the hyp to map the page 
backing IPA "x" as "unprotected" at ALIAS(x) address.

Thus we treat the "top" bit as an attribute in the Realm.

> 
> Also, the linear map is not the only one that points to this IPA. What
> if this is a buffer mapped in user-space or remapped as non-cacheable
> (upgraded to cacheable via FWB) in the kernel, the code above does not
> (and cannot) change the user mappings.
> 
> It needs some digging into dma_direct_alloc() as well, it uses a
> pgprot_decrypted() but that's not implemented by your patches. Not sure
> it helps, it looks like the remap path in this function does not have a
> dma_set_decrypted() call (or maybe I missed it).

Good point. Will take a look.

Suzuki



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ