[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62bd9c59-59c9-0352-b1c5-7336cdf2552d@linux.intel.com>
Date: Mon, 1 Aug 2022 14:39:45 -0700
From: Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Kai Huang <kai.huang@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc: "H . Peter Anvin" <hpa@...or.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Wander Lairson Costa <wander@...hat.com>,
Isaku Yamahata <isaku.yamahata@...il.com>,
marcelo.cerri@...onical.com, tim.gardner@...onical.com,
khalid.elmously@...onical.com, philip.cox@...onical.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 3/6] x86/tdx: Add TDX Guest event notify interrupt
support
On 7/28/22 3:18 AM, Kai Huang wrote:
> On Wed, 2022-07-27 at 20:44 -0700, Kuppuswamy Sathyanarayanan wrote:
>> Host-guest event notification via configured interrupt vector is useful
>> in cases where a guest makes an asynchronous request and needs a
>> callback from the host to indicate the completion or to let the host
>> notify the guest about events like device removal. One usage example is,
>> callback requirement of GetQuote asynchronous hypercall.
>>
>> In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
>> guest to specify which interrupt vector to use as an event-notify
>> vector to the VMM. Details about the SetupEventNotifyInterrupt
>> hypercall can be found in TDX Guest-Host Communication Interface
>> (GHCI) Specification, sec 3.5 "VP.VMCALL<SetupEventNotifyInterrupt>".
>> Add a tdx_hcall_set_notify_intr() helper function to implement the
>> SetupEventNotifyInterrupt hypercall.
>>
>> As per design VMM will post the event completion IRQ using the same CPU
>> in which SetupEventNotifyInterrupt hypercall request is received. So
>> allocate an IRQ vector from "x86_vector_domain", and set the CPU
>> affinity of the IRQ vector to the current CPU.
>
> Set the affinity to the CPU isn't good enough. Please call out we need a non-
> migratable IRQ here, i.e. userspace is not allowed to change the affinity which
> can cause the vector being reallocated from another cpu.
Agree. I will include the relevant details.
>
>>
>> Please note that this patch only reserves the IRQ number. It is
>> expected that the user of event notify IRQ (like GetQuote handler) will
>> directly register the handler for "tdx_notify_irq" IRQ by using
>> request_irq() with IRQF_SHARED flag set. Using IRQF_SHARED will enable
>> multiple users to use the same IRQ for event notification.
>>
>> Reviewed-by: Tony Luck <tony.luck@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> Acked-by: Wander Lairson Costa <wander@...hat.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>> arch/x86/coco/tdx/tdx.c | 84 ++++++++++++++++++++++++++++++++++++++
>> arch/x86/include/asm/tdx.h | 2 +
>> 2 files changed, 86 insertions(+)
>>
>> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
>> index 205f98f42da8..3563b208979c 100644
>> --- a/arch/x86/coco/tdx/tdx.c
>> +++ b/arch/x86/coco/tdx/tdx.c
>> @@ -8,12 +8,16 @@
>> #include <linux/miscdevice.h>
>> #include <linux/mm.h>
>> #include <linux/io.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/irq.h>
>> +#include <linux/numa.h>
>> #include <asm/coco.h>
>> #include <asm/tdx.h>
>> #include <asm/vmx.h>
>> #include <asm/insn.h>
>> #include <asm/insn-eval.h>
>> #include <asm/pgtable.h>
>> +#include <asm/irqdomain.h>
>>
>> #include "tdx.h"
>>
>> @@ -24,6 +28,7 @@
>>
>> /* TDX hypercall Leaf IDs */
>> #define TDVMCALL_MAP_GPA 0x10001
>> +#define TDVMCALL_SETUP_NOTIFY_INTR 0x10004
>>
>> /* MMIO direction */
>> #define EPT_READ 0
>> @@ -42,6 +47,7 @@
>> #define DRIVER_NAME "tdx-guest"
>>
>> static struct miscdevice tdx_misc_dev;
>> +int tdx_notify_irq = -1;
>>
>> /*
>> * Wrapper for standard use of __tdx_hypercall with no output aside from
>> @@ -107,6 +113,31 @@ static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
>> panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
>> }
>>
>> +/*
>> + * tdx_hcall_set_notify_intr() - Setup Event Notify Interrupt Vector.
>> + *
>> + * @vector: Vector address to be used for notification.
>> + *
>> + * return 0 on success or failure error number.
>> + */
>> +static long tdx_hcall_set_notify_intr(u8 vector)
>> +{
>> + /* Minimum vector value allowed is 32 */
>> + if (vector < 32)
>> + return -EINVAL;
>> +
>> + /*
>> + * Register callback vector address with VMM. More details
>> + * about the ABI can be found in TDX Guest-Host-Communication
>> + * Interface (GHCI), sec titled
>> + * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
>> + */
>> + if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, vector, 0, 0, 0))
>> + return -EIO;
>> +
>> + return 0;
>> +}
>> +
>
> I don't think this function is super useful. I guess it's just better to openly
> call the _tdx_hypercall() directly in below. With some comments, this way makes
> the code more clear that we don't want any scheduling during the IRQ/vector
> allocation and the hypervisor to guarantee the vector is always allocated on the
> CPU where the hypercall is called.
>
> Also, checking vector < 32 isn't needed, as the hypercall is guaranteed to
> return error in such case as suggested by GHCI (otherwise it is VMM bug).
>
> The irq_domain_alloc_irqs() call also guarantees the allocated vector will never
> < 32. If really needed, you can just add a WARN_ON(vector < 32) right before
> calling the hypercall.
Agree. I will move this to tdx_arch_init().
>
>> static u64 get_cc_mask(void)
>> {
>> struct tdx_module_output out;
>> @@ -830,3 +861,56 @@ static int __init tdx_guest_init(void)
>> return 0;
>> }
>> device_initcall(tdx_guest_init)
>> +
>> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
>> +static int __init tdx_arch_init(void)
>> +{
>> + struct irq_alloc_info info;
>> + struct irq_cfg *cfg;
>> + int cpu;
>> +
>> + if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>> + return 0;
>> +
>> + /* Make sure x86 vector domain is initialized */
>> + if (!x86_vector_domain) {
>> + pr_err("x86 vector domain is NULL\n");
>> + return 0;
>> + }
>
> I don't think this check is needed. x86_vector_domain is guaranteed to be non-
> NULL in arch_early_irq_init():
>
> x86_vector_domain = irq_domain_create_tree(...);
>
> BUG_ON(x86_vector_domain == NULL);
Makes sense. I can remove the above check.
>
>> +
>> + init_irq_alloc_info(&info, NULL);
>> +
>> + /*
>> + * Event notification vector will be delivered to the CPU
>> + * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>> + * So set the IRQ affinity to the current CPU.
>> + */
>> + cpu = get_cpu();
>> +
>> + info.mask = cpumask_of(cpu);
>> +
>> + tdx_notify_irq = irq_domain_alloc_irqs(x86_vector_domain, 1,
>> + NUMA_NO_NODE, &info);
>> +
>> + if (tdx_notify_irq < 0) {
>> + pr_err("Event notification IRQ allocation failed %d\n",
>> + tdx_notify_irq);
>> + goto init_failed;
>> + }
>> +
>> + irq_set_handler(tdx_notify_irq, handle_edge_irq);
>> +
>> + cfg = irq_cfg(tdx_notify_irq);
>> + if (!cfg) {
>> + pr_err("Event notification IRQ config not found\n");
>> + goto init_failed;
>> + }
>> +
>> + if (tdx_hcall_set_notify_intr(cfg->vector))
>> + pr_err("Setting event notification interrupt failed\n");
>
> So the request_irq() with IRQF_NOBALANCING isn't in this patch but in later
> patch. Nor the IRQ is made affinity kernel-managed in this patch. This means
> for this patch alone, the IRQ is migratable (i.e. usespace can change its
> affinity). In another words, this patch depends on later patch to work. I
> don't think this is right/good.
>
> Perhaps we can explicitly call irq_modify_status() to set IRQF_NOBALANCING here?
>
> irq_modify_status(tdx_notify_irq, 0, IRQ_NO_BALANCING);
I just tested it. It seems to work fine. With the above change, I see
IRQD_NO_BALANCING dstate flag (in /sys/kernel/debug/irq/irqs/xx). I will
include this change.
>
> But again, I am not expert here. It would be helpful if some expert can help to
> check whether this is good way to handle.
>
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Powered by blists - more mailing lists