linux-kernel - Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cffed3c2-55a9-bdd3-3b8a-82b2050a64af@amd.com>
Date:   Mon, 21 Nov 2022 18:37:18 -0600
From:   "Kalra, Ashish" <ashish.kalra@....com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
        jroedel@...e.de, thomas.lendacky@....com, hpa@...or.com,
        ardb@...nel.org, pbonzini@...hat.com, seanjc@...gle.com,
        vkuznets@...hat.com, jmattson@...gle.com, luto@...nel.org,
        dave.hansen@...ux.intel.com, slp@...hat.com, pgonda@...gle.com,
        peterz@...radead.org, srinivas.pandruvada@...ux.intel.com,
        rientjes@...gle.com, dovmurik@...ux.ibm.com, tobin@....com,
        michael.roth@....com, vbabka@...e.cz, kirill@...temov.name,
        ak@...ux.intel.com, tony.luck@...el.com, marcorr@...gle.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, alpergun@...gle.com,
        dgilbert@...hat.com, jarkko@...nel.org
Subject: Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR
 allocation when SNP is enabled

Hello Boris,

On 11/20/2022 3:34 PM, Borislav Petkov wrote:
> On Thu, Nov 17, 2022 at 02:56:47PM -0600, Kalra, Ashish wrote:
>> So we need to be able to reclaim all the pages or none.
> 
> /me goes and looks at SNP_PAGE_RECLAIM's retvals:
> 
> - INVALID_PLATFORM_STATE - platform is not in INIT state. That's
> certainly not a reason to leak pages.

This should not happen, as there are sev->snp_initialized checks before
any firmware page allocation or snp page transitions.

> 
> - INVALID_ADDRESS - PAGE_PADDR is not a valid system physical address.
> That's botched command buffer but not a broken page so no reason to leak
> them either.
> 
> - INVALID_PAGE_STATE - the page is neither of those types: metadata,
> firmware, pre-guest nor pre-swap. So if you issue page reclaim on the
> wrong range of pages that looks again like a user error but no need to
> leak pages.
> 
> - INVALID_PAGE_SIZE - a size mismatch. Still sounds to me like a user
> error of sev-guest instead of anything wrong deeper in the FW or HW.
> 
> So in all those, if you end up supplying the wrong range of addresses,
> you most certainly will end up leaking the wrong pages.
> 
> So it sounds to me like you wanna say: "Error reclaiming range, check
> your driver" instead of punishing any innocent pages.

I agree, but these pages are not in the right state to be released back 
to the system or accessed by the host, because they have already been 
transitioned successfully to firmware state and the reclaim has failed. 
If we release them back to page-allocator and whenever the host accesses 
them, it will get a not-present #PF and it will panic/crash the host 
process.

It might be a user/sev-guest error, but these pages are now unsafe to 
use. So is a kernel panic justified here, instead of not releasing the 
pages back to host and logging errors for the same.

Thanks,
Ashish

> 
> Now, if the retval from the fw were FIRMWARE_INTERNAL_ERROR or so, then
> sure, by all means. But not for the above. All the error conditions
> above sound like the kernel has supplied the wrong range/botched command
> buffer to the firmware so there's no need to leak pages.
> 
> Thx.
>