linux-kernel - Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <380c9748-1c86-4763-ea18-b884280a3b60@amd.com>
Date:   Fri, 14 Oct 2022 15:00:09 -0500
From:   "Kalra, Ashish" <ashish.kalra@....com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
        jroedel@...e.de, thomas.lendacky@....com, hpa@...or.com,
        ardb@...nel.org, pbonzini@...hat.com, seanjc@...gle.com,
        vkuznets@...hat.com, jmattson@...gle.com, luto@...nel.org,
        dave.hansen@...ux.intel.com, slp@...hat.com, pgonda@...gle.com,
        peterz@...radead.org, srinivas.pandruvada@...ux.intel.com,
        rientjes@...gle.com, dovmurik@...ux.ibm.com, tobin@....com,
        michael.roth@....com, vbabka@...e.cz, kirill@...temov.name,
        ak@...ux.intel.com, tony.luck@...el.com, marcorr@...gle.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, alpergun@...gle.com,
        dgilbert@...hat.com, jarkko@...nel.org
Subject: Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR
 allocation when SNP is enabled

Hello Boris,

On 10/13/2022 10:15 AM, Borislav Petkov wrote:
> On Mon, Jun 20, 2022 at 11:05:01PM +0000, Ashish Kalra wrote:
>> +static void snp_leak_pages(unsigned long pfn, unsigned int npages)
> 
> That function name looks wrong.
> 
>> +{
>> +	WARN(1, "psc failed, pfn 0x%lx pages %d (leaking)\n", pfn, npages);
>> +	while (npages--) {
>> +		memory_failure(pfn, 0);
> 		^^^^^^^^^^^^^^^^^^^^^^
> 
> Why?

The page was in FW state and we couldn't transition it back to HV/Shared
state, any access to this page can cause RMP #PF.

> 
>   * This function is called by the low level machine check code
>   * of an architecture when it detects hardware memory corruption
>   * of a page. It tries its best to recover, which includes
>   * dropping pages, killing processes etc.
> 
> I don't think you wanna do that.
> 
> It looks like you want to prevent the page from being used again but not
> mark it as PG_hwpoison and whatnot. PG_reserved perhaps?

  * PG_reserved is set for special pages. The "struct page" of such a
  * page should in general not be touched (e.g. set dirty) except by its
  * owner.

If it is "still" accessed/touched then it can cause RMP #PF.
On the other hand,

  * PG_hwpoison... Accessing is
  * not safe since it may cause another machine check. Don't touch!

That sounds exactly the state we want these page(s) to be in ?

Another possibility is PG_error.

> 
>> +		dump_rmpentry(pfn);
>> +		pfn++;
>> +	}
>> +}
>> +
>> +static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
>> +{
>> +	struct sev_data_snp_page_reclaim data;
>> +	int ret, err, i, n = 0;
>> +
>> +	for (i = 0; i < npages; i++) {
>> +		memset(&data, 0, sizeof(data));
>> +		data.paddr = pfn << PAGE_SHIFT;
> 
> Oh wow, this is just silly. A struct for a single u64. Just use a
> 
> 	u64 paddr;
Ok.
> 
> directly. But we had this topic already...
> 
>> +
>> +		if (locked)
> 
> Ew, that's never a good design - conditional locking.

There is a previous suggestion to change `sev_cmd_mutex` to some sort of 
nesting lock type to clean up this if (locked) code, though AFAIK, there 
is no support for nesting lock types.

> 
>> +			ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
>> +		else
>> +			ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> 
> <---- newline here.
> 
>> +		if (ret)
>> +			goto cleanup;
>> +
>> +		ret = rmp_make_shared(pfn, PG_LEVEL_4K);
>> +		if (ret)
>> +			goto cleanup;
>> +
>> +		pfn++;
>> +		n++;
>> +	}
>> +
>> +	return 0;
>> +
>> +cleanup:
>> +	/*
>> +	 * If failed to reclaim the page then page is no longer safe to
>> +	 * be released, leak it.
>> +	 */
>> +	snp_leak_pages(pfn, npages - n);
> 
> So this looks real weird: we go and reclaim pages, we hit an error
> during reclaiming a page X somewhere in-between and then we go and mark
> the *remaining* pages as not to be used?!
> 
> Why?
> 
> Why not only that *one* page which failed and then we continue with the
> rest?!

I agree and will change to this approach.

> 
>> +	return ret;
>> +}
>> +
>> +static inline int rmp_make_firmware(unsigned long pfn, int level)
>> +{
>> +	return rmp_make_private(pfn, 0, level, 0, true);
>> +}
> 
> That's a silly wrapper used only once. Just do at the callsite:
> 
> 	/* Mark this page as belonging to firmware */
> 	rc = rmp_make_private(pfn, 0, level, 0, true);
> 
Ok.

>> +
>> +static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
>> +			     bool need_reclaim)
> 
> Tangential to the above, this is just nuts with those bool arguments.
> Just look at the callsites: do you understand what they do?
> 
> 	snp_set_rmp_state(paddr, npages, true, locked, false);
> 
> what does that do? You need to go up to the definition of the function,
> count the arguments and see what that "true" arg stands for.

I totally agree, this is simply unreadable.

And this has been mentioned previously too ...
This function can do a lot and when I read the call sites its hard to
see what its doing since we have a combination of arguments which tell
us what behavior is happening ...

> 
> What you should do instead is, have separate helpers which do only one
> thing:
> 
> 	rmp_mark_pages_firmware();
> 	rmp_mark_pages_shared();
> 	rmp_mark_pages_...
> 
> and then have the *callers* issue snp_reclaim_pages() when needed. So you'd have
> 
> 	rmp_mark_pages_firmware();
> 	rmp_mark_pages_shared()
> 
> and __snp_free_firmware_pages() would do
> 
> 	rmp_mark_pages_shared();
> 	snp_reclaim_pages();
> 
Actually, this only needs to call snp_reclaim_pages().

> and so on.
> 
> And then if you need locking, the callers can decide which sev_do_cmd
> variant to issue.
> 
> And then if you have common code fragments which you can unify into a
> bigger helper function, *then* you can do that.
> 
> Instead of multiplexing it this way. Which makes it really hard to
> follow what the code does.
> 

Sure i will do this cleanup.

> 
>> +	unsigned long pfn = __sme_clr(paddr) >> PAGE_SHIFT; /* Cbit maybe set in the paddr */
> 
> No side comments pls.
> 
>> +	int rc, n = 0, i;
>> +
>> +	for (i = 0; i < npages; i++) {
>> +		if (to_fw)
>> +			rc = rmp_make_firmware(pfn, PG_LEVEL_4K);
>> +		else
>> +			rc = need_reclaim ? snp_reclaim_pages(pfn, 1, locked) :
>> +					    rmp_make_shared(pfn, PG_LEVEL_4K);
>> +		if (rc)
>> +			goto cleanup;
>> +
>> +		pfn++;
>> +		n++;
>> +	}
>> +
>> +	return 0;
>> +
>> +cleanup:
>> +	/* Try unrolling the firmware state changes */
>> +	if (to_fw) {
>> +		/*
>> +		 * Reclaim the pages which were already changed to the
>> +		 * firmware state.
>> +		 */
>> +		snp_reclaim_pages(paddr >> PAGE_SHIFT, n, locked);
>> +
>> +		return rc;
>> +	}
>> +
>> +	/*
>> +	 * If failed to change the page state to shared, then its not safe
>> +	 * to release the page back to the system, leak it.
>> +	 */
>> +	snp_leak_pages(pfn, npages - n);
>> +
>> +	return rc;
>> +}
> 
> ...
> 
>> +void snp_free_firmware_page(void *addr)
>> +{
>> +	if (!addr)
>> +		return;
>> +
>> +	__snp_free_firmware_pages(virt_to_page(addr), 0, false);
>> +}
>> +EXPORT_SYMBOL(snp_free_firmware_page);
> 
> EXPORT_SYMBOL_GPL() ofc.
> 

Thanks,
Ashish