lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <65b9e7e66e46_5a9dd294ae@dwillia2-mobl3.amr.corp.intel.com.notmuch>
Date: Tue, 30 Jan 2024 22:25:42 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Isaku Yamahata <isaku.yamahata@...il.com>, Zhi Wang
	<zhi.wang.linux@...il.com>
CC: <isaku.yamahata@...el.com>, <kvm@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <isaku.yamahata@...il.com>, Paolo Bonzini
	<pbonzini@...hat.com>, <erdemaktas@...gle.com>, Sean Christopherson
	<seanjc@...gle.com>, Sagi Shahar <sagis@...gle.com>, David Matlack
	<dmatlack@...gle.com>, Kai Huang <kai.huang@...el.com>, Sean Christopherson
	<sean.j.christopherson@...el.com>
Subject: Re: [PATCH v13 016/113] KVM: TDX: x86: Add ioctl to get TDX
 systemwide parameters

Isaku Yamahata wrote:
> On Wed, Mar 29, 2023 at 04:17:22PM -0700,
> Isaku Yamahata <isaku.yamahata@...il.com> wrote:
> 
> > On Sat, Mar 25, 2023 at 10:43:06AM +0200,
> > Zhi Wang <zhi.wang.linux@...il.com> wrote:
> > 
> > > On Sun, 12 Mar 2023 10:55:40 -0700
> > > isaku.yamahata@...el.com wrote:
> > > 
> > > Does this have to be a new generic ioctl with a dedicated new x86_ops? SNP
> > > does not use it at all and all the system-scoped ioctl of SNP going through
> > > the CCP driver. So getting system-scope information of TDX/SNP will end up
> > > differently.
> > > 
> > > Any thought, Sean? Moving getting SNP system-wide information to
> > > KVM dev ioctl seems not ideal and TDX does not have a dedicated driver like
> > > CCP. Maybe make this ioctl TDX-specific? KVM_TDX_DEV_OP?
> > 
> > We only need global parameters of the TDX module, and we don't interact with TDX
> > module at this point.  One alternative is to export those parameters via sysfs.
> > Also the existence of the sysfs node indicates that the TDX module is
> > loaded(initialized?) or not in addition to boot log.  Thus we can drop system
> > scope one.
> > What do you think?
> > 
> > Regarding to other TDX KVM specific ioctls (KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU,
> > KVM_TDX_INIT_MEM_REGION, and KVM_TDX_FINALIZE_VM), they are specific to KVM.  So
> > I don't think it can be split out to independent driver.
> 
> Here is the patch to export those info via sysfs.
> 
> From e0744e506eb92e47d8317e489945a3ba804edfa7 Mon Sep 17 00:00:00 2001
> Message-Id: <e0744e506eb92e47d8317e489945a3ba804edfa7.1680221730.git.isaku.yamahata@...el.com>
> In-Reply-To: <8e0bc0e8e5d435f54f10c7642a862629ef2acb89.1680221729.git.isaku.yamahata@...el.com>
> References: <8e0bc0e8e5d435f54f10c7642a862629ef2acb89.1680221729.git.isaku.yamahata@...el.com>
> From: Isaku Yamahata <isaku.yamahata@...el.com>
> Date: Thu, 30 Mar 2023 00:05:03 -0700
> Subject: [PATCH] x86/virt/tdx: Export TD config params of TDX module via sysfs
> 
> TDX module has parameters for VMM to configure TD.  User space VMM, e.g.
> qemu, needs to know it. Export them to user space via sysfs.
> 
> TDX 1.0 provides TDH.SYS.INFO to provide system information in
> TDSYSINFO_STRUCT.  Its future extensibility is limited because of its
> struct.  From TDX 1.5, TDH.SYS.RD(metadata field_id) to read the info
> specified by field id.  So instead of exporting TDSYSINFO_STRUCT, adapt
> metadata way to export those system information.

Hi, I came across tdx_sysfs_init() recently and had some comments if
this proposal is going to move forward:

> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> ---
>  Documentation/ABI/testing/sysfs-firmware-tdx |  23 +++
>  arch/x86/include/asm/tdx.h                   |  33 ++++
>  arch/x86/virt/vmx/tdx/tdx.c                  | 164 +++++++++++++++++++
>  arch/x86/virt/vmx/tdx/tdx.h                  |  18 ++
>  4 files changed, 238 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-firmware-tdx
> 
> diff --git a/Documentation/ABI/testing/sysfs-firmware-tdx b/Documentation/ABI/testing/sysfs-firmware-tdx
> new file mode 100644
> index 000000000000..1f26fb178144
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-tdx
> @@ -0,0 +1,23 @@
> +What:           /sys/firmware/tdx/tdx_module/metadata

The TDX module is not "platform firmware" in comparison to the other EFI
and ACPI inhabitants in /sys/firmware. It is especially not static
platform firmware given it needs to be dynamically activated via KVM
module initialization.

Instead, sysfs already has a location for pure software construct
objects to host a sysfs ABI and that is /sys/bus/virtual. I propose a
common "TSM" class device here [1] and TDX can simply publish a named
attribute group, "host", to extend that class device with TDX specifics.

For cross-vendor consistency "host" is a symlink to the CCP device on
AMD.

[1]: http://lore.kernel.org/r/170660662589.224441.11503798303914595072.stgit@dwillia2-xfh.jf.intel.com

> +Date:           March 2023
> +KernelVersion:  6.3
> +Contact:        Isaku Yamahata <isaku.yamahata@...el.com>, kvm@...r.kernel.org
> +Users:          qemu, libvirt
> +Description:
> +                The TDX feature requires a firmware that is known as the TDX
> +                module.  The TDX module exposes its metadata in the following
> +                read-only files.  The information corresponds to the TDX global
> +                metadata specified by 64bit field id.
> +                string in lower case.  The value is binary.
> +                User space VMM like qemu needs refer to them to determine what
> +                parameters are needed or allowed to configure guest TDs.
> +
> +                ================== ============================================
> +                1900000300000000   ATTRIBUTES_FIXED0
> +                1900000300000001   ATTRIBUTES_FIXED1
> +                1900000300000002   XFAM_FIXED0
> +                1900000300000003   XFAM_FIXED1
> +                9900000100000004   NUM_CPUID_CONFIG
> +                9900000300000400   CPUID_LEAVES
> +                9900000300000500   CPUID_VALUES
> +                ================== ============================================

This documentation needs to be per file. With an explanation of how each
file is expected to be used. Someone should reasonably be able to read
this documentation and go write a tool, I don't get that from this
documentation.

> \ No newline at end of file
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 05870e5ed131..c650ac22a916 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -110,6 +110,39 @@ struct tdx_cpuid_config {
>  	u32	edx;
>  } __packed;
>  
> +struct tdx_cpuid_config_leaf {
> +	u32	leaf;
> +	u32	sub_leaf;
> +} __packed;
> +static_assert(offsetof(struct tdx_cpuid_config, leaf) ==
> +	      offsetof(struct tdx_cpuid_config_leaf, leaf));
> +static_assert(offsetof(struct tdx_cpuid_config, sub_leaf) ==
> +	      offsetof(struct tdx_cpuid_config_leaf, sub_leaf));
> +static_assert(offsetofend(struct tdx_cpuid_config, sub_leaf) ==
> +	      sizeof(struct tdx_cpuid_config_leaf));
> +
> +struct tdx_cpuid_config_value {
> +	u32	eax;
> +	u32	ebx;
> +	u32	ecx;
> +	u32	edx;
> +} __packed;
> +static_assert(offsetof(struct tdx_cpuid_config, eax) -
> +	      offsetof(struct tdx_cpuid_config, eax) ==
> +	      offsetof(struct tdx_cpuid_config_value, eax));
> +static_assert(offsetof(struct tdx_cpuid_config, ebx) -
> +	      offsetof(struct tdx_cpuid_config, eax) ==
> +	      offsetof(struct tdx_cpuid_config_value, ebx));
> +static_assert(offsetof(struct tdx_cpuid_config, ecx) -
> +	      offsetof(struct tdx_cpuid_config, eax) ==
> +	      offsetof(struct tdx_cpuid_config_value, ecx));
> +static_assert(offsetof(struct tdx_cpuid_config, edx) -
> +	      offsetof(struct tdx_cpuid_config, eax) ==
> +	      offsetof(struct tdx_cpuid_config_value, edx));
> +static_assert(offsetofend(struct tdx_cpuid_config, edx) -
> +	      offsetof(struct tdx_cpuid_config, eax) ==
> +	      sizeof(struct tdx_cpuid_config_value));
> +
>  #define TDSYSINFO_STRUCT_SIZE		1024
>  #define TDSYSINFO_STRUCT_ALIGNMENT	1024
>  
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index f9f9c1b76501..56ca520d67d6 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -33,6 +33,12 @@
>  #include <asm/tdx.h>
>  #include "tdx.h"
>  
> +#ifdef CONFIG_SYSFS
> +static int tdx_sysfs_init(void);
> +#else
> +static inline int tdx_sysfs_init(void) { return 0;}
> +#endif
> +
>  u32 tdx_global_keyid __ro_after_init;
>  EXPORT_SYMBOL_GPL(tdx_global_keyid);
>  static u32 tdx_guest_keyid_start __ro_after_init;
> @@ -399,6 +405,10 @@ static int __tdx_get_sysinfo(struct tdsysinfo_struct *sysinfo,
>  	if (ret)
>  		return ret;
>  
> +	ret = tdx_sysfs_init();
> +	if (ret)
> +		return ret;
> +
>  	pr_info("TDX module: atributes 0x%x, vendor_id 0x%x, major_version %u, minor_version %u, build_date %u, build_num %u",
>  		sysinfo->attributes,	sysinfo->vendor_id,
>  		sysinfo->major_version, sysinfo->minor_version,
> @@ -1367,3 +1377,157 @@ int tdx_enable(void)
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(tdx_enable);
> +
> +#ifdef CONFIG_SYSFS
> +
> +static struct kobject *tdx_kobj;
> +static struct kobject *tdx_module_kobj;
> +static struct kobject *tdx_metadata_kobj;
> +
> +#define TDX_METADATA_ATTR(_name, field_id_name, _size)		\
> +static struct bin_attribute tdx_metadata_ ## _name = {		\
> +	.attr = {						\
> +		.name = field_id_name,				\
> +		.mode = 0444,					\
> +	},							\
> +	.size = _size,						\
> +	.read = tdx_metadata_ ## _name ## _show,		\
> +}
> +
> +#define TDX_METADATA_ATTR_SHOW(_name, field_id_name)					\
> +static ssize_t tdx_metadata_ ## _name ## _show(struct file *filp, struct kobject *kobj,	\
> +					       struct bin_attribute *bin_attr,		\
> +					       char *buf, loff_t offset, size_t count)	\
> +{											\
> +	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);			\
> +											\
> +	return memory_read_from_buffer(buf, count, &offset,				\
> +				       &sysinfo->_name,					\
> +				       sizeof(sysinfo->_name));				\
> +}											\
> +TDX_METADATA_ATTR(_name, field_id_name, sizeof_field(struct tdsysinfo_struct, _name))
> +
> +TDX_METADATA_ATTR_SHOW(attributes_fixed0, TDX_METADATA_ATTRIBUTES_FIXED0_NAME);
> +TDX_METADATA_ATTR_SHOW(attributes_fixed1, TDX_METADATA_ATTRIBUTES_FIXED1_NAME);
> +TDX_METADATA_ATTR_SHOW(xfam_fixed0, TDX_METADATA_XFAM_FIXED0_NAME);
> +TDX_METADATA_ATTR_SHOW(xfam_fixed1, TDX_METADATA_XFAM_FIXED1_NAME);
> +
> +static ssize_t tdx_metadata_num_cpuid_config_show(struct file *filp, struct kobject *kobj,
> +						  struct bin_attribute *bin_attr,
> +						  char *buf, loff_t offset, size_t count)
> +{
> +	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> +	/*
> +	 * Although tdsysinfo_struct.num_cpuid_config is defined as u32 for
> +	 * alignment, TDX 1.5 defines metadata NUM_CONFIG_CPUID as u16.
> +	 */
> +	u16 tmp = (u16)sysinfo->num_cpuid_config;
> +
> +	WARN_ON_ONCE(tmp != sysinfo->num_cpuid_config);

Why crash the kernel here?

> +	return memory_read_from_buffer(buf, count, &offset, &tmp, sizeof(tmp));
> +}
> +TDX_METADATA_ATTR(num_cpuid_config, TDX_METADATA_NUM_CPUID_CONFIG_NAME, sizeof(u16));
> +
> +static ssize_t tdx_metadata_cpuid_leaves_show(struct file *filp, struct kobject *kobj,
> +					      struct bin_attribute *bin_attr, char *buf,
> +					      loff_t offset, size_t count)
> +{
> +	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> +	ssize_t r;
> +	struct tdx_cpuid_config_leaf *tmp;
> +	u32 i;
> +
> +	tmp = kmalloc(bin_attr->size, GFP_KERNEL);
> +	if (!tmp)
> +		return -ENOMEM;

Why is this allocating and then blindly copying bin_attr->size into
@buf? It it either knows that @buf is big enough, no need to allocate,
or if it does not know if @buf is big enough then the copy into @tmp
offers no protection.

> +
> +	for (i = 0; i < sysinfo->num_cpuid_config; i++) {
> +		struct tdx_cpuid_config *c = &sysinfo->cpuid_configs[i];
> +		struct tdx_cpuid_config_leaf *leaf = (struct tdx_cpuid_config_leaf *)c;
> +
> +		memcpy(tmp + i, leaf, sizeof(*leaf));
> +	}
> +
> +	r = memory_read_from_buffer(buf, count, &offset, tmp, bin_attr->size);
> +	kfree(tmp);
> +	return r;
> +}
> +
> +TDX_METADATA_ATTR(cpuid_leaves, TDX_METADATA_CPUID_LEAVES_NAME, 0);
> +
> +static ssize_t tdx_metadata_cpuid_values_show(struct file *filp, struct kobject *kobj,
> +					      struct bin_attribute *bin_attr, char *buf,
> +					      loff_t offset, size_t count)
> +{
> +	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> +	struct tdx_cpuid_config_value *tmp;
> +	ssize_t r;
> +	u32 i;
> +
> +	tmp = kmalloc(bin_attr->size, GFP_KERNEL);
> +	if (!tmp)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < sysinfo->num_cpuid_config; i++) {
> +		struct tdx_cpuid_config *c = &sysinfo->cpuid_configs[i];
> +		struct tdx_cpuid_config_value *value = (struct tdx_cpuid_config_value *)&c->eax;
> +
> +		memcpy(tmp + i, value, sizeof(*value));
> +	}
> +
> +	r = memory_read_from_buffer(buf, count, &offset, tmp, bin_attr->size);
> +	kfree(tmp);
> +	return r;
> +}
> +
> +TDX_METADATA_ATTR(cpuid_values, TDX_METADATA_CPUID_VALUES_NAME, 0);
> +
> +static struct bin_attribute *tdx_metadata_attrs[] = {
> +	&tdx_metadata_attributes_fixed0,
> +	&tdx_metadata_attributes_fixed1,
> +	&tdx_metadata_xfam_fixed0,
> +	&tdx_metadata_xfam_fixed1,
> +	&tdx_metadata_num_cpuid_config,
> +	&tdx_metadata_cpuid_leaves,
> +	&tdx_metadata_cpuid_values,
> +	NULL,
> +};
> +
> +static const struct attribute_group tdx_metadata_attr_group = {
> +	.bin_attrs = tdx_metadata_attrs,
> +};
> +
> +static int tdx_sysfs_init(void)
> +{
> +	struct tdsysinfo_struct *sysinfo;
> +	int ret;
> +
> +	tdx_kobj = kobject_create_and_add("tdx", firmware_kobj);
> +	if (!tdx_kobj) {
> +		pr_err("kobject_create_and_add tdx failed\n");
> +		return -EINVAL;
> +	}

Subsystems, PCI for example [2], are slowly unwinding their usage of dynamic
sysfs_create_*() APIs in favor of static attribute groups. Dynamic
kobject_create_*() usage is even more of an anti-pattern for new code.

This goes away with static attribute group registration.

[2]: https://lore.kernel.org/linux-pci/20231019200110.GA1410324@bhelgaas/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ