Message-ID: <SN6PR2101MB169309ACB83862DA6E572A89D7DDA@SN6PR2101MB1693.namprd21.prod.outlook.com>
Date: Thu, 26 Oct 2023 00:35:53 +0000
From: "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>,
"x86@...nel.org" <x86@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"hpa@...or.com" <hpa@...or.com>,
"luto@...nel.org" <luto@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"elena.reshetova@...el.com" <elena.reshetova@...el.com>,
"isaku.yamahata@...el.com" <isaku.yamahata@...el.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>,
Dexuan Cui <decui@...rosoft.com>,
"sathyanarayanan.kuppuswamy@...ux.intel.com"
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] x86/mm/cpa: Warn if set_memory_XXcrypted() fails

From: Rick Edgecombe <rick.p.edgecombe@...el.com> Sent: Tuesday, October 24, 2023 4:48 PM
>
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.

I think you mean "shared" as indicated by the guest page tables (vs. "shared"
as the state of the page from the host's standpoint). Some precision on
that distinction seems useful here and in follow-on patches so that callers'
error handling is correct. As I understand it, the premise is that if the
guest is accessing a page as private, the confidentiality of the VM is
protected even if the host/VMM has messed around with the page's
private/shared status. The risk of leakage occurs when the guest is accessing
a page as shared, so kernel code must guard against putting memory
on the free list if the guest page tables are marked shared.
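
To illustrate that rule, here is a minimal user-space sketch in C. It is
not kernel code: model_page, model_set_encrypted() and model_release() are
hypothetical stand-ins, and the vmm_fails flag only simulates a
host-induced conversion failure. It assumes, per the patch description,
that a failed conversion leaves the page marked shared in the guest page
tables, and it shows the caller leaking the page rather than freeing it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a page; these are not kernel data structures. */
struct model_page {
	bool guest_pte_shared;	/* state according to the guest page tables */
	bool on_free_list;	/* true once handed back to the allocator */
};

/*
 * Stand-in for set_memory_encrypted(). When vmm_fails is set, the
 * conversion returns an error and the guest PTE stays marked shared.
 */
static int model_set_encrypted(struct model_page *pg, bool vmm_fails)
{
	if (vmm_fails)
		return -EIO;

	pg->guest_pte_shared = false;
	return 0;
}

/* Release a buffer that was previously converted to shared. */
static void model_release(struct model_page *pg, bool vmm_fails)
{
	if (model_set_encrypted(pg, vmm_fails)) {
		/* Still shared in the guest PTEs: leak it, never free it. */
		return;
	}

	pg->on_free_list = true;
}
```

Calling model_release() with vmm_fails set leaves on_free_list false, which
is the behavior callers need: a leaked page costs memory, but a freed
shared page can be reallocated for data that the host is then able to read.
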
>
> Such errors may herald future system instability, but are temporarily
> survivable with proper handling in the caller. The kernel traditionally
> makes every effort to keep running, but it is expected that some coco
> guests may prefer to play it safe security-wise, and panic in this case.
> To accommodate both cases, warn when the arch breakouts for converting
> memory at the VMM layer return an error to CPA. Security focused users
> can rely on panic_on_warn to defend against bugs in the callers.

To me, this sentence doesn't fully characterize why panic_on_warn
would be used. You describe one reason: a caller that fails to
properly handle an error and incorrectly puts memory with a "shared"
guest PTE on the free list. But getting an error back also implies that
something unknown has gone wrong with the CoCo mechanism for
managing private vs. shared pages. Security-focused users would not
take the risk of continuing to operate with that kind of unknown error
in the core mechanism of a CoCo VM.
>
> Since the arch breakouts host the logic for handling coco implementation
> specific errors, an error returned from them means that the set_memory()
> call is out of options for handling the error internally. Make this the
> condition to warn about.
>
> It is possible that very rarely these functions could fail due to guest
> memory pressure (in the case of failing to allocate a huge page when
> splitting a page table). Don't warn in this case because it is a lot less
> likely to indicate an attack by the host and it is not clear which
> set_memory() calls should get the same treatment. That corner should be
> addressed by future work that considers the more general problem and not
> just papers over a single set_memory() variant.
>
> Suggested-by: Michael Kelley (LINUX) <mikelley@...rosoft.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
> ---
> This is a followup to the "Handle set_memory_XXcrypted() errors"
> series[0].
>
> Previously[1] I attempted to create a useful helper to both simplify the
> callers and provide an official example of how to handle conversion
> errors. Dave pointed out that there wasn't actually any code savings in
> the callers using it. It also required a whole additional patch to make
> set_memory_XXcrypted() more robust.
>
> I tried to create some more sensible helper, but in the end gave up. My
> current plan is to just add a warning for VMM failures around this. And
> then shortly after, pursue open coded fixes for the callers that are
> problems for TDX. There are some SEV and SME specifics callers, that I am
> not sure on. But I'm under the impression that as long as that side
> terminates the guest on error, they should be harmless.
>
> [0] https://lore.kernel.org/lkml/20231017202505.340906-1-rick.p.edgecombe@intel.com/
> [1] https://lore.kernel.org/lkml/20231017202505.340906-2-rick.p.edgecombe@intel.com/
> ---
> arch/x86/mm/pat/set_memory.c | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index bda9f129835e..dade281f449b 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2153,7 +2153,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>
>  	/* Notify hypervisor that we are about to set/clr encryption attribute. */
>  	if (!x86_platform.guest.enc_status_change_prepare(addr, numpages, enc))
> -		return -EIO;
> +		goto vmm_fail;
>
>  	ret = __change_page_attr_set_clr(&cpa, 1);
>
> @@ -2167,12 +2167,20 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	cpa_flush(&cpa, 0);
>
>  	/* Notify hypervisor that we have successfully set/clr encryption attribute. */
> -	if (!ret) {
> -		if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
> -			ret = -EIO;
> -	}
> +	if (ret)
> +		goto out;
>
> +	if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
> +		goto vmm_fail;
> +
> +out:
>  	return ret;
> +
> +vmm_fail:
> +	WARN_ONCE(1, "CPA VMM failure to convert memory (addr=%p, numpages=%d) to %s.\n",
> +		  (void *)addr, numpages, enc ? "private" : "shared");
I'm not sure about outputting the "addr" value. It could be
useful, but the %p format specifier hashes the value unless the
kernel is booted with "no_hash_pointers". Should %px be used
so the address is output unmodified?
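
For comparison, one alternative would be to print "addr" as the unsigned
long it already is, since printk does not hash integers printed with %lx.
The following is only a user-space sketch of the resulting message text:
format_cpa_warning() is a hypothetical helper built on snprintf(), not
something in the patch, and whether exposing the raw address is acceptable
here is a separate question.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical user-space helper: builds the warning text with the
 * address formatted as a plain unsigned long (%lx) rather than as a
 * pointer (%p). Returns the snprintf() result.
 */
static int format_cpa_warning(char *buf, size_t len, unsigned long addr,
			      int numpages, bool enc)
{
	return snprintf(buf, len,
			"CPA VMM failure to convert memory (addr=%lx, numpages=%d) to %s.\n",
			addr, numpages, enc ? "private" : "shared");
}
```

With addr = 0x1000, numpages = 4 and enc = false, the buffer holds
"CPA VMM failure to convert memory (addr=1000, numpages=4) to shared."
followed by a newline.
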
> +
> +	return -EIO;
>  }
>
> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> --
> 2.34.1

My comments notwithstanding, I'm good with this overall change and
the additional level of protection it offers to CoCo VM users.

Michael