linux-kernel - Re: [PATCH v3 1/5] tsm-mr: Add TVM Measurement Register support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e4b6bf2-b76d-4fd6-8ff4-48551bf28784@intel.com>
Date: Thu, 10 Apr 2025 23:01:54 -0500
From: "Xing, Cedric" <cedric.xing@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, "Kirill A. Shutemov"
	<kirill.shutemov@...ux.intel.com>, Dave Hansen <dave.hansen@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, <x86@...nel.org>, "H. Peter Anvin"
	<hpa@...or.com>
CC: <linux-kernel@...r.kernel.org>, <linux-coco@...ts.linux.dev>, "Dionna
 Amalie Glaze" <dionnaglaze@...gle.com>, Guorui Yu
	<guorui.yu@...ux.alibaba.com>, James Bottomley
	<James.Bottomley@...senpartnership.com>, Dan Middleton
	<dan.middleton@...ux.intel.com>, Mikko Ylinen <mikko.ylinen@...ux.intel.com>,
	Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>
Subject: Re: [PATCH v3 1/5] tsm-mr: Add TVM Measurement Register support

On 4/8/2025 7:27 PM, Dan Williams wrote:
> Cedric Xing wrote:
[...]

>> ---
>>   MAINTAINERS                |   4 +-
>>   drivers/virt/coco/Kconfig  |   5 ++
>>   drivers/virt/coco/Makefile |   1 +
>>   drivers/virt/coco/tsm-mr.c | 209 +++++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/tsm-mr.h     |  93 ++++++++++++++++++++
> 
> I note that the pending proposals for TEE I/O suggests splitting
> drivers/virt/coco/ into drivers/virt/coco/{host,guest} [1] [2] [3].
> 
> [1]: http://lore.kernel.org/174107246021.1288555.7203769833791489618.stgit@dwillia2-xfh.jf.intel.com
> [2]: http://lore.kernel.org/20250218111017.491719-8-aik@amd.com
> [3]: http://lore.kernel.org/20250218111017.491719-17-aik@amd.com
> 
> So if I take this through devsec.git I will get that rename queued and
> ask you to rebase on top of that.
> 
No problem.

[...]

>> +	/*
>> +	 * @ctx->in_sync indicates if any MRs have been written since the last
>> +	 * ctx->refresh() call. When @ctx->in_sync is false, ctx->refresh() is
>> +	 * necessary to sync the cached values of all live MRs (i.e., with
>> +	 * %TSM_MR_F_LIVE set) with the underlying hardware.
>> +	 */
> 
> Code comments should add to the understanding of the code, not simply
> restate the code in prose. So I would replace this comment with some
> non-obvious insight to aid future maintenance, something like:
> 
> /*
>   * Note that the typical read path for MRs is via an attestation report,
>   * this is why the ->write() path does not automatically ->refresh()
>   * invalidated data for TSM_MR_LIVE registers. The use case for reading
>   * back a individual hash-extending-write to an MR is for debug not
>   * attestation.
>   */
>
> ...at least an explanation like that would have helped me understand the
> locking and caching model of this implementation.
>
The reasoning behind this involves not only ->refresh() and ->write() 
but also LIVE and ->in_sync. Generally, both ->refresh() and ->write() 
could be expensive so we are trying to do only the minimum. I'll add 
comments to the definition of this context structure.

[...]

>> +static ssize_t tm_digest_write(struct file *filp, struct kobject *kobj,
>> +			       const struct bin_attribute *attr, char *buffer,
>> +			       loff_t off, size_t count)
>> +{
>> +	struct tm_context *ctx;
>> +	const struct tsm_measurement_register *mr;
>> +	ssize_t rc;
>> +
>> +	/* partial writes are not supported */
>> +	if (off != 0 || count != attr->size)
>> +		return -EINVAL;
>> +
>> +	ctx = attr->private;
>> +	mr = &ctx->tm->mrs[attr - ctx->mrs];
>> +
>> +	rc = down_write_killable(&ctx->rwsem);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = ctx->tm->write(ctx->tm, mr, buffer);
> 
> There needs to be explicit ABI and driver-api documentation here for what are the
> allowed error codes that ->write() can return so as not to be confused
> with EINVAL or EINTR arising from user error or interrupt.
> 
I understand your point. But different CC archs may have arch specific 
reasons for failures. It'd be hard to whitelist all the allowed errors; 
while blacklisting EINVAL may make more sense, as users have no control 
over the input (hence cannot provide invalid input) to arch specific 
write/extend functions. I'll add to the description of ->write() in its 
kernel-doc comments.

[...]

>> +/**
>> + * tsm_mr_create_attribute_group() - creates an attribute group for measurement
>> + * registers
>> + * @tm: pointer to &struct tsm_measurements containing the MR definitions.
>> + *
>> + * This function creates attributes corresponding to the MR definitions
>> + * provided by @tm->mrs.
>> + *
>> + * The created attributes will reference @tm and its members. The caller must
>> + * not free @tm until after tsm_mr_free_attribute_group() is called.
>> + *
>> + * Context: Process context. May sleep due to memory allocation.
>> + *
>> + * Return:
>> + * * On success, the pointer to a an attribute group is returned; otherwise
>> + * * %-EINVAL - Invalid MR definitions.
>> + * * %-ENOMEM - Out of memory.
>> + */
>> +const struct attribute_group *__must_check
> 
> No need to mark this function as __must_check. That attribute is
> typically reserved for core-apis.
> 
Will remove.

>> +tsm_mr_create_attribute_group(const struct tsm_measurements *tm)
>> +{
>> +	if (!tm->mrs)
>> +		return ERR_PTR(-EINVAL);
> 
> If you're going to check for !tm->mrs, might as well also check for !tm.
> 
Good catch! Will change.

>> +
>> +	/* aggregated length of all MR names */
>> +	size_t nlen = 0;
> 
> Typically the only exceptions for not declaring variables at the top of
> a function are for "for ()" loops and scope-based cleanup.
> 
Will move it to the top.

>> +
>> +	for (size_t i = 0; i < tm->nr_mrs; ++i) {
>> +		if ((tm->mrs[i].mr_flags & TSM_MR_F_LIVE) && !tm->refresh)
>> +			return ERR_PTR(-EINVAL);
>> +
>> +		if ((tm->mrs[i].mr_flags & TSM_MR_F_WRITABLE) && !tm->write)
>> +			return ERR_PTR(-EINVAL);
>> +
>> +		if (tm->mrs[i].mr_flags & TSM_MR_F_NOHASH)
>> +			continue;
>> +
>> +		if (WARN_ON(tm->mrs[i].mr_hash >= HASH_ALGO__LAST))
> 
> Why potentially crash the kernel here? EINVAL should be sufficient.
> 
Agreed! Will change.

>> +			return ERR_PTR(-EINVAL);
>> +
>> +		/* MR sysfs attribute names have the form of MRNAME:HASH */
>> +		nlen += strlen(tm->mrs[i].mr_name) + 1 +
>> +			strlen(hash_algo_name[tm->mrs[i].mr_hash]) + 1;
>> +	}
>> +
>> +	/*
>> +	 * @bas and the MR name strings are combined into a single allocation
>> +	 * so that we don't have to free MR names one-by-one in
>> +	 * tsm_mr_free_attribute_group()
>> +	 */
>> +	struct bin_attribute **bas __free(kfree) =
>> +		kzalloc(sizeof(*bas) * (tm->nr_mrs + 1) + nlen, GFP_KERNEL);
>> +	struct tm_context *ctx __free(kfree) =
>> +		kzalloc(struct_size(ctx, mrs, tm->nr_mrs), GFP_KERNEL);
>> +	char *name, *end;
>> +
>> +	if (!ctx || !bas)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	/* @bas is followed immediately by MR name strings */
>> +	name = (char *)&bas[tm->nr_mrs + 1];
> 
> I looked for a helper macro for a "buffer at the end of a structure",
> but could not immediately find one. It feels like something Linux should
> already have.
> 
Given a pointer some_struct_p, the end of it will be (some_struct_p + 1) 
or &some_struct_p[1]. It'd be more readable to be wrapped by a macro for 
sure.

>> +	end = name + nlen;
>> +
>> +	for (size_t i = 0; i < tm->nr_mrs; ++i) {
>> +		bas[i] = &ctx->mrs[i];
>> +		sysfs_bin_attr_init(bas[i]);
>> +
>> +		if (tm->mrs[i].mr_flags & TSM_MR_F_NOHASH)
>> +			bas[i]->attr.name = tm->mrs[i].mr_name;
>> +		else if (name < end) {
>> +			bas[i]->attr.name = name;
>> +			name += snprintf(name, end - name, "%s:%s",
>> +					 tm->mrs[i].mr_name,
>> +					 hash_algo_name[tm->mrs[i].mr_hash]);
>> +			++name;
>> +		} else
>> +			return ERR_PTR(-EINVAL);
>> +
>> +		/* check for duplicated MR definitions */
>> +		for (size_t j = 0; j < i; ++j)
>> +			if (!strcmp(bas[i]->attr.name, bas[j]->attr.name))
>> +				return ERR_PTR(-EINVAL);
>> +
>> +		if (tm->mrs[i].mr_flags & TSM_MR_F_READABLE) {
>> +			bas[i]->attr.mode |= 0444;
>> +			bas[i]->read_new = tm_digest_read;
>> +		}
>> +
>> +		if (tm->mrs[i].mr_flags & TSM_MR_F_WRITABLE) {
>> +			bas[i]->attr.mode |= 0220;
> 
> Typical expectation for writable attributes is 0200.
> 
Will change.

[...]

>> +/**
>> + * tsm_mr_free_attribute_group() - frees the attribute group returned by
>> + * tsm_mr_create_attribute_group()
>> + * @attr_grp: attribute group returned by tsm_mr_create_attribute_group()
>> + *
>> + * Context: Process context.
>> + */
>> +void tsm_mr_free_attribute_group(const struct attribute_group *attr_grp)
>> +{
> 
> Related to the removal of __must_check add safety here for cases where
> someone passes in an ERR_PTR():
> 
> 	if (IS_ERR_OR_NULL(attr_grp)
> 		return;
> 
> This also makes the function amenable to scope-based cleanup.
> 
Will change.

[...]

>> +/**
>> + * struct tsm_measurement_register - describes an architectural measurement
>> + * register (MR)
>> + * @mr_name: name of the MR
>> + * @mr_value: buffer containing the current value of the MR
>> + * @mr_size: size of the MR - typically the digest size of @mr_hash
>> + * @mr_flags: bitwise OR of one or more flags, detailed below
>> + * @mr_hash: optional hash identifier defined in include/uapi/linux/hash_info.h.
>> + *
>> + * A CC guest driver encloses an array of this structure in struct
>> + * tsm_measurements to detail the measurement facility supported by the
>> + * underlying CC hardware.
>> + *
>> + * @mr_name and @mr_value must stay valid until this structure is no longer in
>> + * use.
>> + *
>> + * @mr_flags is the bitwise-OR of zero or more of the flags below.
>> + *
>> + * * %TSM_MR_F_READABLE - the sysfs attribute corresponding to this MR is readable.
>> + * * %TSM_MR_F_WRITABLE - the sysfs attribute corresponding to this MR is writable.
>> + * * %TSM_MR_F_LIVE - this MR's value may differ from the last value written, so
>> + *   must be read back from the underlying CC hardware/firmware.
> 
> Maybe use the word "extend" somewhere in this description for clarity.
> 
Will clarify.

[...]

>> +struct tsm_measurements {
>> +	const char *name;
>> +	const struct tsm_measurement_register *mrs __counted_by(nr_mrs);
> 
> I had assumed that __counted_by() is only for inline flexible arrays,
> not out-of-line dynamically allocated arrays. Are you sure this does
> what you expect?
> 
Thanks for pointing this out! I misunderstood __counted_by, and will 
remove it.

-Cedric