Message-ID: <SN6PR12MB276722570164ECD120BA4D628EB39@SN6PR12MB2767.namprd12.prod.outlook.com>
Date: Tue, 21 Jun 2022 20:17:15 +0000
From: "Kalra, Ashish" <Ashish.Kalra@....com>
To: Peter Gonda <pgonda@...gle.com>
CC: the arch/x86 maintainers <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
kvm list <kvm@...r.kernel.org>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
"Lendacky, Thomas" <Thomas.Lendacky@....com>,
"H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Jim Mattson <jmattson@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sergio Lopez <slp@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dov Murik <dovmurik@...ux.ibm.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Borislav Petkov <bp@...en8.de>,
"Roth, Michael" <Michael.Roth@....com>,
Vlastimil Babka <vbabka@...e.cz>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Andi Kleen <ak@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>, Marc Orr <marcorr@...gle.com>,
Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
Alper Gun <alpergun@...gle.com>,
"Dr. David Alan Gilbert" <dgilbert@...hat.com>,
"jarkko@...nel.org" <jarkko@...nel.org>
Subject: RE: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR
allocation when SNP is enabled
[Public]
Hello Peter,
>> +static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
>> +{
>> +	struct sev_data_snp_page_reclaim data;
>> +	int ret, err, i, n = 0;
>> +
>> +	for (i = 0; i < npages; i++) {
>What about setting |n| here too, also the other increments.
>for (i = 0, n = 0; i < npages; i++, n++, pfn++)
Yes, that is simpler.
>> +		memset(&data, 0, sizeof(data));
>> +		data.paddr = pfn << PAGE_SHIFT;
>> +
>> +		if (locked)
>> +			ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
>> +		else
>> +			ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> Can we change `sev_cmd_mutex` to some sort of nesting lock type? That could clean up this if (locked) code.
>> +static inline int rmp_make_firmware(unsigned long pfn, int level)
>> +{
>> +	return rmp_make_private(pfn, 0, level, 0, true);
>> +}
>> +
>> +static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
>> +			     bool need_reclaim)
>This function can do a lot, and when I read the call sites it's hard to see what it's doing, since we have a combination of arguments which tell us what behavior is happening, some of which are not valid (ex: to_fw == true and need_reclaim == true is an invalid argument combination).
to_fw is used to make a firmware page and need_reclaim is used when freeing a firmware page, so the two are mutually exclusive.
The callers map onto this quite logically:
snp_alloc_firmware_pages() calls with to_fw = true and need_reclaim = false,
and snp_free_firmware_pages() does the opposite: to_fw = false and need_reclaim = true.
That seems straightforward to follow.
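To make the two valid combinations explicit at the call sites, they could be pinned down with thin wrappers. The following is only a user-space sketch, not code from the patch: snp_set_rmp_state() is stubbed so that it merely records its arguments, and the wrapper names snp_set_rmp_fw_state() and snp_set_rmp_shared_state() are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Arguments of the most recent accepted call, kept so the sketch can be checked. */
struct rmp_call {
	unsigned long paddr;
	unsigned int npages;
	bool to_fw;
	bool locked;
	bool need_reclaim;
};

static struct rmp_call last_call;

/* Stub standing in for the patch's snp_set_rmp_state(); it only records arguments. */
static int snp_set_rmp_state(unsigned long paddr, unsigned int npages,
			     bool to_fw, bool locked, bool need_reclaim)
{
	/* to_fw and need_reclaim are mutually exclusive. */
	if (to_fw && need_reclaim)
		return -EINVAL;

	last_call = (struct rmp_call){ paddr, npages, to_fw, locked, need_reclaim };
	return 0;
}

/* Hypothetical wrapper for the allocation path: hand pages to the firmware. */
static int snp_set_rmp_fw_state(unsigned long paddr, unsigned int npages, bool locked)
{
	return snp_set_rmp_state(paddr, npages, true, locked, false);
}

/* Hypothetical wrapper for the free path: reclaim pages and make them shared. */
static int snp_set_rmp_shared_state(unsigned long paddr, unsigned int npages, bool locked)
{
	return snp_set_rmp_state(paddr, npages, false, locked, true);
}
```

With wrappers like these the callers never spell out the boolean pair, so the invalid combination cannot be written by accident.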
>Also this for loop over |npages| is duplicated from snp_reclaim_pages(). One improvement here is that on the current
>snp_reclaim_pages() if we fail to reclaim a page we assume we cannot reclaim the next pages; this may cause us to snp_leak_pages() more pages than we actually need to.
Yes, that is true.
>What about something like this?
>static int snp_leak_page(u64 pfn, enum pg_level level) {
>	memory_failure(pfn, 0);
>	dump_rmpentry(pfn);
>	return 0;	/* int return so it matches rmp_state_change_func */
>}
>static int snp_reclaim_page(u64 pfn, enum pg_level level) {
>	struct sev_data_snp_page_reclaim data;
>	int ret, err;
>
>	memset(&data, 0, sizeof(data));
>	data.paddr = pfn << PAGE_SHIFT;
>
>	ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
>	if (ret)
>		goto cleanup;
>	ret = rmp_make_shared(pfn, level);
>	if (ret)
>		goto cleanup;
>	return 0;
>cleanup:
>	snp_leak_page(pfn, level);
>	return ret;
>}
>typedef int (*rmp_state_change_func) (u64 pfn, enum pg_level level);
>static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, rmp_state_change_func state_change, rmp_state_change_func cleanup) {
>	u64 pfn = paddr >> PAGE_SHIFT;
>	unsigned int i;
>	int ret;
>
>	for (i = 0; i < npages; i++, pfn++) {
>		ret = state_change(pfn, PG_LEVEL_4K);
>		if (ret)
>			goto unwind;
>	}
>	return 0;
>unwind:
>	/* Page i failed its state change; undo only pages 0..i-1. */
>	while (i-- > 0)
>		cleanup(--pfn, PG_LEVEL_4K);
>	return ret;
>}
>Then inside of __snp_alloc_firmware_pages():
>snp_set_rmp_state(paddr, npages, rmp_make_firmware, snp_reclaim_page);
>And inside of __snp_free_firmware_pages():
>snp_set_rmp_state(paddr, npages, snp_reclaim_page, snp_leak_page);
>Just a suggestion, feel free to ignore. The readability comment could be addressed much less invasively by just making separate functions for each valid combination of arguments here, like snp_set_rmp_fw_state(), snp_set_rmp_shared_state(),
>snp_set_rmp_release_state() or something.
>> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked)
>> +{
>> +	unsigned long npages = 1ul << order, paddr;
>> +	struct sev_device *sev;
>> +	struct page *page;
>> +
>> +	if (!psp_master || !psp_master->sev_data)
>> +		return NULL;
>> +
>> +	page = alloc_pages(gfp_mask, order);
>> +	if (!page)
>> +		return NULL;
>> +
>> +	/* If SEV-SNP is initialized then add the page in RMP table. */
>> +	sev = psp_master->sev_data;
>> +	if (!sev->snp_inited)
>> +		return page;
>> +
>> +	paddr = __pa((unsigned long)page_address(page));
>> +	if (snp_set_rmp_state(paddr, npages, true, locked, false))
>> +		return NULL;
>So what about the case where snp_set_rmp_state() fails but we were able to reclaim all the pages? Should we be able to signal that to callers so that we could free |page| here? But given this is an error path already maybe we can optimize this in a follow up series.
Yes, we should tie this to the success or failure of snp_reclaim_pages(), to cover the case where we were able to unroll some or all of the firmware state change.
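As a rough illustration of how that signalling might look, here is a self-contained user-space sketch. Everything in it is hypothetical and not taken from the patch: the mock_* helpers stand in for the RMP and firmware calls, the fail_*_at variables inject failures, and set_fw_state() and alloc_fw_pages() are simplified stand-ins for snp_set_rmp_state() and __snp_alloc_firmware_pages().

```c
#include <stdbool.h>

#define MAX_PAGES 8

/* Failure injection for the sketch: index at which each operation fails, -1 for never. */
static int fail_make_fw_at = -1;
static int fail_reclaim_at = -1;

/* Simulated RMP state: true while a page is owned by the firmware. */
static bool page_is_fw[MAX_PAGES];

static int mock_make_firmware(unsigned int idx)
{
	if ((int)idx == fail_make_fw_at)
		return -1;
	page_is_fw[idx] = true;
	return 0;
}

static int mock_reclaim(unsigned int idx)
{
	if ((int)idx == fail_reclaim_at)
		return -1;
	page_is_fw[idx] = false;
	return 0;
}

/*
 * Move npages to the firmware state. Returns 0 on success. On failure,
 * *rolled_back reports whether every page converted so far was reclaimed
 * again; it is only meaningful when the return value is non-zero.
 */
static int set_fw_state(unsigned int npages, bool *rolled_back)
{
	unsigned int i;

	*rolled_back = true;
	if (npages > MAX_PAGES)
		return -1;	/* nothing was converted, so nothing to roll back */

	for (i = 0; i < npages; i++) {
		if (mock_make_firmware(i))
			goto unwind;
	}
	return 0;

unwind:
	/* Page i was never converted; undo pages i-1 down to 0. */
	while (i-- > 0) {
		if (mock_reclaim(i))
			*rolled_back = false;
	}
	return -1;
}

enum alloc_result { ALLOC_OK, ALLOC_FAILED_FREED, ALLOC_FAILED_LEAKED };

/* The caller frees the pages only when the rollback fully succeeded. */
static enum alloc_result alloc_fw_pages(unsigned int npages)
{
	bool rolled_back;

	if (!set_fw_state(npages, &rolled_back))
		return ALLOC_OK;
	return rolled_back ? ALLOC_FAILED_FREED : ALLOC_FAILED_LEAKED;
}
```

The extra out-parameter is just one way to carry the information; a distinct error code for "failed but fully unrolled" would serve the same purpose.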
>> +
>> +	return page;
>> +}
>> +
>> +void *snp_alloc_firmware_page(gfp_t gfp_mask)
>> +{
>> +	struct page *page;
>> +
>> +	page = __snp_alloc_firmware_pages(gfp_mask, 0, false);
>> +
>> +	return page ? page_address(page) : NULL;
>> +}
>> +EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
>> +
>> +static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
>> +{
>> +	unsigned long paddr, npages = 1ul << order;
>> +
>> +	if (!page)
>> +		return;
>> +
>> +	paddr = __pa((unsigned long)page_address(page));
>> +	if (snp_set_rmp_state(paddr, npages, false, locked, true))
>> +		return;
> Here we may be able to free some of |page| depending on where inside of snp_set_rmp_state() we failed. But again given this is an error path already maybe we can optimize this in a follow up series.
Yes, we should probably be able to free some of the pages, depending on how many were reclaimed in snp_set_rmp_state().
But these reclamation failures should not be common, so any failure is indicative of a bigger issue. If reclamation fails for a single page, it may well fail for all the subsequent pages too,
so it seems better to follow a simple recovery procedure than to handle a more complex recovery where one chunk of pages is reclaimed and another is not.
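To make the trade-off concrete, here is a small user-space sketch of the two policies. It is not code from the patch: mock_reclaim() and the reclaim_fails[] table are hypothetical stand-ins for SEV_CMD_SNP_PAGE_RECLAIM and its failures, and reclaim_simple() and reclaim_each() are illustrative names.

```c
#include <stdbool.h>

#define NPAGES 4

/* Simulated firmware: reclaim fails for every index marked true here. */
static bool reclaim_fails[NPAGES];

static int mock_reclaim(unsigned int idx)
{
	return reclaim_fails[idx] ? -1 : 0;
}

/*
 * Simple recovery: stop at the first failure and treat that page and all
 * later ones as leaked. Returns the number of pages reclaimed.
 */
static unsigned int reclaim_simple(unsigned int npages)
{
	unsigned int i;

	for (i = 0; i < npages; i++) {
		if (mock_reclaim(i))
			break;
	}
	return i;
}

/*
 * Finer-grained recovery: try every page and record in reclaimed[] which
 * ones are safe to free. Returns the number of pages reclaimed.
 */
static unsigned int reclaim_each(unsigned int npages, bool *reclaimed)
{
	unsigned int i, n = 0;

	for (i = 0; i < npages; i++) {
		reclaimed[i] = (mock_reclaim(i) == 0);
		if (reclaimed[i])
			n++;
	}
	return n;
}
```

The second variant leaks fewer pages when failures are isolated, at the cost of tracking per-page state; in the kernel, freeing only part of a higher-order allocation would need additional handling that this sketch does not attempt.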
Thanks,
Ashish