Message-Id: <20210527043832.3984374-1-sathyanarayanan.kuppuswamy@linux.intel.com>
Date:   Wed, 26 May 2021 21:38:32 -0700
From:   Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        linux-kernel@...r.kernel.org
Subject: [RFC v2-fix-v3 1/1] x86/tdx: Ignore WBINVD instruction for TDX guest

Functionally, only devices outside the CPU (such as DMA devices,
or persistent memory, which relies on flushing) can observe the
side effects of WBINVD's cache flush for write-back mappings. One
exception is MKTME, but that is not visible outside the TDX
module and is not available inside a TDX guest.

Currently TDX does not support DMA, because DMA typically needs
uncached access for MMIO, and the current TDX module always sets
the IgnorePAT bit, which prevents that.

Persistent memory is also currently not supported. Some other
cases use WBINVD, such as the legacy ACPI sleep states, but none
of these are supported under virtualization, and better
mechanisms exist inside a guest anyway. Guests are usually not
aware of power management. Another code path that uses WBINVD is
the MTRR driver, but EPT/virtualization always disables MTRRs, so
that path is not needed either. All of this implies that WBINVD
is not needed with current TDX.

So handle the WBINVD instruction as a nop. The #VE exception
handler does not emit a warning for WBINVD because the ACPI
reboot code uses it. This matches KVM's behavior: KVM only allows
WBINVD in a guest when the guest supports VT-d (=DMA), and
otherwise handles it as a nop.
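
As an illustration only, the nop handling can be modeled in user
space as below. This is a sketch, not the kernel code: the struct
and function names are hypothetical stand-ins, the exit reason
values follow the VMX numbering, and it assumes that "handled"
means the instruction is skipped by advancing the instruction
pointer by its length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Exit reasons, numbered as in the VMX architecture (subset only). */
enum exit_reason {
	EXIT_REASON_MWAIT_INSTRUCTION = 36,
	EXIT_REASON_WBINVD            = 54,
};

/* Hypothetical, cut-down stand-ins for the kernel's #VE info and pt_regs. */
struct ve_info_sketch {
	enum exit_reason exit_reason;
	unsigned int instr_len;		/* length of the trapping instruction */
};

struct regs_sketch {
	unsigned long ip;
	unsigned long cache_flushes;	/* stays 0: the sketch never flushes */
};

/*
 * Returns true when the exit was handled. WBINVD is handled by doing
 * nothing, after which the instruction is skipped. Assumption: skipping
 * is modeled as ip += instr_len.
 */
static bool handle_ve_sketch(struct regs_sketch *regs,
			     const struct ve_info_sketch *ve)
{
	assert(regs != NULL && ve != NULL);

	switch (ve->exit_reason) {
	case EXIT_REASON_WBINVD:
		/* Nothing in the guest can observe the flush: treat as nop. */
		break;
	default:
		return false;	/* not covered by this sketch */
	}

	regs->ip += ve->instr_len;
	return true;
}
```

With a two-byte WBINVD (opcode 0F 09), a call with instr_len = 2
leaves cache_flushes untouched and moves ip forward by two bytes.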

If TDX ever gets DMA support, persistent memory support, or some
other device that can observe the side effects of flushing, a
hypercall can be added to implement it, similar to AMD SEV. But
current TDX does not need it.
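
The resulting policy can be sketched as a small decision function.
This is a hypothetical model, not part of the patch: the
capability flags and the enum are invented for illustration, and
no such hypercall exists in current TDX.

```c
#include <stdbool.h>

/* Hypothetical description of what could observe a cache flush. */
struct guest_caps_sketch {
	bool noncoherent_dma;
	bool persistent_memory;
};

enum wbinvd_action {
	WBINVD_NOP,		/* current TDX: nothing can observe the flush */
	WBINVD_HYPERCALL,	/* possible future: ask the host to flush */
};

/* WBINVD only needs real work when some observer of the flush exists. */
static enum wbinvd_action
wbinvd_policy_sketch(const struct guest_caps_sketch *caps)
{
	if (caps->noncoherent_dma || caps->persistent_memory)
		return WBINVD_HYPERCALL;
	return WBINVD_NOP;
}
```

With both flags false, as in current TDX, the function returns
WBINVD_NOP, which is the case this patch implements.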

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
---

Changes since RFC v2-2:
 * Added more details to commit log and comments to address
   review comments.

Changes since RFC v2:
 * Fixed commit log as per review comments.
 * Removed WARN_ONCE for WBINVD #VE support.

 arch/x86/kernel/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index da5c9cd08299..775ae090b625 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -455,6 +455,13 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
 	case EXIT_REASON_EPT_VIOLATION:
 		ve->instr_len = tdg_handle_mmio(regs, ve);
 		break;
+	case EXIT_REASON_WBINVD:
+		/*
+		 * Non-coherent DMA, persistent memory, MTRRs or
+		 * outdated ACPI sleeps are not supported in a TDX guest.
+		 * So ignore WBINVD and treat it as a nop.
+		 */
+		break;
 	case EXIT_REASON_MONITOR_INSTRUCTION:
 	case EXIT_REASON_MWAIT_INSTRUCTION:
 		/*
-- 
2.25.1
