[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19db3f9edf0af05424fd16d0c7a1ea23766e8330.camel@intel.com>
Date: Thu, 28 Jul 2022 22:18:38 +1200
From: Kai Huang <kai.huang@...el.com>
To: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc: "H . Peter Anvin" <hpa@...or.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Wander Lairson Costa <wander@...hat.com>,
Isaku Yamahata <isaku.yamahata@...il.com>,
marcelo.cerri@...onical.com, tim.gardner@...onical.com,
khalid.elmously@...onical.com, philip.cox@...onical.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 3/6] x86/tdx: Add TDX Guest event notify interrupt
support
On Wed, 2022-07-27 at 20:44 -0700, Kuppuswamy Sathyanarayanan wrote:
> Host-guest event notification via configured interrupt vector is useful
> in cases where a guest makes an asynchronous request and needs a
> callback from the host to indicate the completion or to let the host
> notify the guest about events like device removal. One usage example is,
> callback requirement of GetQuote asynchronous hypercall.
>
> In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
> guest to specify which interrupt vector to use as an event-notify
> vector to the VMM. Details about the SetupEventNotifyInterrupt
> hypercall can be found in TDX Guest-Host Communication Interface
> (GHCI) Specification, sec 3.5 "VP.VMCALL<SetupEventNotifyInterrupt>".
> Add a tdx_hcall_set_notify_intr() helper function to implement the
> SetupEventNotifyInterrupt hypercall.
>
> As per design VMM will post the event completion IRQ using the same CPU
> in which SetupEventNotifyInterrupt hypercall request is received. So
> allocate an IRQ vector from "x86_vector_domain", and set the CPU
> affinity of the IRQ vector to the current CPU.
Set the affinity to the CPU isn't good enough. Please call out we need a non-
migratable IRQ here, i.e. userspace is not allowed to change the affinity which
can cause the vector being reallocated from another cpu.
>
> Please note that this patch only reserves the IRQ number. It is
> expected that the user of event notify IRQ (like GetQuote handler) will
> directly register the handler for "tdx_notify_irq" IRQ by using
> request_irq() with IRQF_SHARED flag set. Using IRQF_SHARED will enable
> multiple users to use the same IRQ for event notification.
>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Acked-by: Wander Lairson Costa <wander@...hat.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> ---
> arch/x86/coco/tdx/tdx.c | 84 ++++++++++++++++++++++++++++++++++++++
> arch/x86/include/asm/tdx.h | 2 +
> 2 files changed, 86 insertions(+)
>
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 205f98f42da8..3563b208979c 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -8,12 +8,16 @@
> #include <linux/miscdevice.h>
> #include <linux/mm.h>
> #include <linux/io.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/numa.h>
> #include <asm/coco.h>
> #include <asm/tdx.h>
> #include <asm/vmx.h>
> #include <asm/insn.h>
> #include <asm/insn-eval.h>
> #include <asm/pgtable.h>
> +#include <asm/irqdomain.h>
>
> #include "tdx.h"
>
> @@ -24,6 +28,7 @@
>
> /* TDX hypercall Leaf IDs */
> #define TDVMCALL_MAP_GPA 0x10001
> +#define TDVMCALL_SETUP_NOTIFY_INTR 0x10004
>
> /* MMIO direction */
> #define EPT_READ 0
> @@ -42,6 +47,7 @@
> #define DRIVER_NAME "tdx-guest"
>
> static struct miscdevice tdx_misc_dev;
> +int tdx_notify_irq = -1;
>
> /*
> * Wrapper for standard use of __tdx_hypercall with no output aside from
> @@ -107,6 +113,31 @@ static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
> }
>
> +/*
> + * tdx_hcall_set_notify_intr() - Setup Event Notify Interrupt Vector.
> + *
> + * @vector: Vector address to be used for notification.
> + *
> + * return 0 on success or failure error number.
> + */
> +static long tdx_hcall_set_notify_intr(u8 vector)
> +{
> + /* Minimum vector value allowed is 32 */
> + if (vector < 32)
> + return -EINVAL;
> +
> + /*
> + * Register callback vector address with VMM. More details
> + * about the ABI can be found in TDX Guest-Host-Communication
> + * Interface (GHCI), sec titled
> + * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
> + */
> + if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, vector, 0, 0, 0))
> + return -EIO;
> +
> + return 0;
> +}
> +
I don't think this function is super useful. I guess it's just better to openly
call the _tdx_hypercall() directly in below. With some comments, this way makes
the code more clear that we don't want any scheduling during the IRQ/vector
allocation and the hypervisor to guarantee the vector is always allocated on the
CPU where the hypercall is called.
Also, checking vector < 32 isn't needed, as the hypercall is guaranteed to
return error in such case as suggested by GHCI (otherwise it is VMM bug).
The irq_domain_alloc_irqs() call also guarantees the allocated vector will never
< 32. If really needed, you can just add a WARN_ON(vector < 32) right before
calling the hypercall.
> static u64 get_cc_mask(void)
> {
> struct tdx_module_output out;
> @@ -830,3 +861,56 @@ static int __init tdx_guest_init(void)
> return 0;
> }
> device_initcall(tdx_guest_init)
> +
> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
> +static int __init tdx_arch_init(void)
> +{
> + struct irq_alloc_info info;
> + struct irq_cfg *cfg;
> + int cpu;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> + return 0;
> +
> + /* Make sure x86 vector domain is initialized */
> + if (!x86_vector_domain) {
> + pr_err("x86 vector domain is NULL\n");
> + return 0;
> + }
I don't think this check is needed. x86_vector_domain is guaranteed to be non-
NULL in arch_early_irq_init():
x86_vector_domain = irq_domain_create_tree(...);
BUG_ON(x86_vector_domain == NULL);
> +
> + init_irq_alloc_info(&info, NULL);
> +
> + /*
> + * Event notification vector will be delivered to the CPU
> + * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> + * So set the IRQ affinity to the current CPU.
> + */
> + cpu = get_cpu();
> +
> + info.mask = cpumask_of(cpu);
> +
> + tdx_notify_irq = irq_domain_alloc_irqs(x86_vector_domain, 1,
> + NUMA_NO_NODE, &info);
> +
> + if (tdx_notify_irq < 0) {
> + pr_err("Event notification IRQ allocation failed %d\n",
> + tdx_notify_irq);
> + goto init_failed;
> + }
> +
> + irq_set_handler(tdx_notify_irq, handle_edge_irq);
> +
> + cfg = irq_cfg(tdx_notify_irq);
> + if (!cfg) {
> + pr_err("Event notification IRQ config not found\n");
> + goto init_failed;
> + }
> +
> + if (tdx_hcall_set_notify_intr(cfg->vector))
> + pr_err("Setting event notification interrupt failed\n");
So the request_irq() with IRQF_NOBALANCING isn't in this patch but in later
patch. Nor the IRQ is made affinity kernel-managed in this patch. This means
for this patch alone, the IRQ is migratable (i.e. usespace can change its
affinity). In another words, this patch depends on later patch to work. I
don't think this is right/good.
Perhaps we can explicitly call irq_modify_status() to set IRQF_NOBALANCING here?
irq_modify_status(tdx_notify_irq, 0, IRQ_NO_BALANCING);
But again, I am not expert here. It would be helpful if some expert can help to
check whether this is good way to handle.
--
Thanks,
-Kai
Powered by blists - more mailing lists