linux-kernel - Re: [PATCH v9 3/6] x86/tdx: Add TDX Guest event notify interrupt support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62bd9c59-59c9-0352-b1c5-7336cdf2552d@linux.intel.com>
Date:   Mon, 1 Aug 2022 14:39:45 -0700
From:   Sathyanarayanan Kuppuswamy 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Kai Huang <kai.huang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc:     "H . Peter Anvin" <hpa@...or.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Tony Luck <tony.luck@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Wander Lairson Costa <wander@...hat.com>,
        Isaku Yamahata <isaku.yamahata@...il.com>,
        marcelo.cerri@...onical.com, tim.gardner@...onical.com,
        khalid.elmously@...onical.com, philip.cox@...onical.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 3/6] x86/tdx: Add TDX Guest event notify interrupt
 support



On 7/28/22 3:18 AM, Kai Huang wrote:
> On Wed, 2022-07-27 at 20:44 -0700, Kuppuswamy Sathyanarayanan wrote:
>> Host-guest event notification via configured interrupt vector is useful
>> in cases where a guest makes an asynchronous request and needs a
>> callback from the host to indicate the completion or to let the host
>> notify the guest about events like device removal. One usage example is,
>> callback requirement of GetQuote asynchronous hypercall.
>>
>> In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
>> guest to specify which interrupt vector to use as an event-notify
>> vector to the VMM. Details about the SetupEventNotifyInterrupt
>> hypercall can be found in TDX Guest-Host Communication Interface
>> (GHCI) Specification, sec 3.5 "VP.VMCALL<SetupEventNotifyInterrupt>".
>> Add a tdx_hcall_set_notify_intr() helper function to implement the
>> SetupEventNotifyInterrupt hypercall.
>>
>> As per design VMM will post the event completion IRQ using the same CPU
>> in which SetupEventNotifyInterrupt hypercall request is received. So
>> allocate an IRQ vector from "x86_vector_domain", and set the CPU
>> affinity of the IRQ vector to the current CPU.
> 
> Set the affinity to the CPU isn't good enough.  Please call out we need a non-
> migratable IRQ here, i.e. userspace is not allowed to change the affinity which
> can cause the vector being reallocated from another cpu.

Agree. I will include the relevant details.

> 
>>
>> Please note that this patch only reserves the IRQ number. It is
>> expected that the user of event notify IRQ (like GetQuote handler) will
>> directly register the handler for "tdx_notify_irq" IRQ by using
>> request_irq() with  IRQF_SHARED flag set. Using IRQF_SHARED will enable
>> multiple users to use the same IRQ for event notification.


>>
>> Reviewed-by: Tony Luck <tony.luck@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> Acked-by: Wander Lairson Costa <wander@...hat.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>>  arch/x86/coco/tdx/tdx.c    | 84 ++++++++++++++++++++++++++++++++++++++
>>  arch/x86/include/asm/tdx.h |  2 +
>>  2 files changed, 86 insertions(+)
>>
>> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
>> index 205f98f42da8..3563b208979c 100644
>> --- a/arch/x86/coco/tdx/tdx.c
>> +++ b/arch/x86/coco/tdx/tdx.c
>> @@ -8,12 +8,16 @@
>>  #include <linux/miscdevice.h>
>>  #include <linux/mm.h>
>>  #include <linux/io.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/irq.h>
>> +#include <linux/numa.h>
>>  #include <asm/coco.h>
>>  #include <asm/tdx.h>
>>  #include <asm/vmx.h>
>>  #include <asm/insn.h>
>>  #include <asm/insn-eval.h>
>>  #include <asm/pgtable.h>
>> +#include <asm/irqdomain.h>
>>  
>>  #include "tdx.h"
>>  
>> @@ -24,6 +28,7 @@
>>  
>>  /* TDX hypercall Leaf IDs */
>>  #define TDVMCALL_MAP_GPA		0x10001
>> +#define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
>>  
>>  /* MMIO direction */
>>  #define EPT_READ	0
>> @@ -42,6 +47,7 @@
>>  #define DRIVER_NAME	"tdx-guest"
>>  
>>  static struct miscdevice tdx_misc_dev;
>> +int tdx_notify_irq = -1;
>>  
>>  /*
>>   * Wrapper for standard use of __tdx_hypercall with no output aside from
>> @@ -107,6 +113,31 @@ static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
>>  		panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
>>  }
>>  
>> +/*
>> + * tdx_hcall_set_notify_intr() - Setup Event Notify Interrupt Vector.
>> + *
>> + * @vector: Vector address to be used for notification.
>> + *
>> + * return 0 on success or failure error number.
>> + */
>> +static long tdx_hcall_set_notify_intr(u8 vector)
>> +{
>> +	/* Minimum vector value allowed is 32 */
>> +	if (vector < 32)
>> +		return -EINVAL;
>> +
>> +	/*
>> +	 * Register callback vector address with VMM. More details
>> +	 * about the ABI can be found in TDX Guest-Host-Communication
>> +	 * Interface (GHCI), sec titled
>> +	 * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
>> +	 */
>> +	if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, vector, 0, 0, 0))
>> +		return -EIO;
>> +
>> +	return 0;
>> +}
>> +
> 
> I don't think this function is super useful.  I guess it's just better to openly
> call the _tdx_hypercall() directly in below.  With some comments, this way makes
> the code more clear that we don't want any scheduling during the IRQ/vector
> allocation and the hypervisor to guarantee the vector is always allocated on the
> CPU where the hypercall is called.
> 
> Also, checking vector < 32 isn't needed, as the hypercall is guaranteed to
> return error in such case as suggested by GHCI (otherwise it is VMM bug).
> 
> The irq_domain_alloc_irqs() call also guarantees the allocated vector will never
> < 32.  If really needed, you can just add a WARN_ON(vector < 32) right before
> calling the hypercall.

Agree. I will move this to tdx_arch_init().

> 
>>  static u64 get_cc_mask(void)
>>  {
>>  	struct tdx_module_output out;
>> @@ -830,3 +861,56 @@ static int __init tdx_guest_init(void)
>>  	return 0;
>>  }
>>  device_initcall(tdx_guest_init)
>> +
>> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
>> +static int __init tdx_arch_init(void)
>> +{
>> +	struct irq_alloc_info info;
>> +	struct irq_cfg *cfg;
>> +	int cpu;
>> +
>> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>> +		return 0;
>> +
>> +	/* Make sure x86 vector domain is initialized */
>> +	if (!x86_vector_domain) {
>> +		pr_err("x86 vector domain is NULL\n");
>> +		return 0;
>> +	}
> 
> I don't think  this check is needed.  x86_vector_domain is guaranteed to be non-
> NULL in arch_early_irq_init():
> 
>         x86_vector_domain = irq_domain_create_tree(...);
> 
>         BUG_ON(x86_vector_domain == NULL);

Makes sense. I can remove the above check.

> 
>> +
>> +	init_irq_alloc_info(&info, NULL);
>> +
>> +	/*
>> +	 * Event notification vector will be delivered to the CPU
>> +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>> +	 * So set the IRQ affinity to the current CPU.
>> +	 */
>> +	cpu = get_cpu();
>> +
>> +	info.mask = cpumask_of(cpu);
>> +
>> +	tdx_notify_irq = irq_domain_alloc_irqs(x86_vector_domain, 1,
>> +				NUMA_NO_NODE, &info);
>> +
>> +	if (tdx_notify_irq < 0) {
>> +		pr_err("Event notification IRQ allocation failed %d\n",
>> +				tdx_notify_irq);
>> +		goto init_failed;
>> +	}
>> +
>> +	irq_set_handler(tdx_notify_irq, handle_edge_irq);
>> +
>> +	cfg = irq_cfg(tdx_notify_irq);
>> +	if (!cfg) {
>> +		pr_err("Event notification IRQ config not found\n");
>> +		goto init_failed;
>> +	}
>> +
>> +	if (tdx_hcall_set_notify_intr(cfg->vector))
>> +		pr_err("Setting event notification interrupt failed\n");
> 
> So the request_irq() with IRQF_NOBALANCING isn't in this patch but in later
> patch.  Nor the IRQ is made affinity kernel-managed in this patch.  This means
> for this patch alone, the IRQ is migratable (i.e. usespace can change its
> affinity).  In another words, this patch depends on later patch to work.  I
> don't think this is right/good.
> 
> Perhaps we can explicitly call irq_modify_status() to set IRQF_NOBALANCING here?
> 
> 	irq_modify_status(tdx_notify_irq, 0, IRQ_NO_BALANCING);


I just tested it. It seems to work fine. With the above change, I see 
IRQD_NO_BALANCING dstate flag (in /sys/kernel/debug/irq/irqs/xx). I will
include this change.


> 
> But again, I am not expert here.  It would be helpful if some expert can help to
> check whether this is good way to handle.
> 
> 



-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer