linux-kernel - Re: [PATCH v4 4/6] x86/microcode/intel: Implement staging handler

Message-ID: <002f259d-2f20-4428-add3-a02650bc728b@intel.com>
Date: Wed, 13 Aug 2025 11:44:11 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: "Chang S. Bae" <chang.seok.bae@...el.com>, x86@...nel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, colinmitchell@...gle.com, chao.gao@...el.com,
 abusse@...zon.de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 4/6] x86/microcode/intel: Implement staging handler

> +/*
> + * Determine if the next data chunk can be sent. Each chunk is typically
> + * one page unless the remaining data is smaller. If the total
> + * transmitted data exceeds the defined limit, a timeout occurs.
> + */

This comment isn't really telling the whole story. The function isn't just
determining whether the chunk can be sent; it's also calculating the chunk
size and filling it in.

> +static bool can_send_next_chunk(struct staging_state *ss)
> +{
> +	WARN_ON_ONCE(ss->ucode_len < ss->offset);

Please don't WARN_ON() here. WARNs can be fatal because of panic_on_warn.
Also, I think this is the wrong spot for this check. We should enforce it at
the time ss->offset is _established_, which is, oddly enough, in the next
patch.

	ss->offset = read_mbox_dword(ss->mmio_base);
	if (ss->offset > ss->ucode_len)
		// error out
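
A minimal user-space sketch of that check, to make the idea concrete. The
struct layout and the read_mbox_dword() mock are stand-ins, not the patch's
real definitions, and -EINVAL is only a placeholder for whatever error the
series settles on. Unlike the fragment above, this version validates the
value before storing it, so ss->offset never holds an out-of-range value:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's staging state. */
struct staging_state {
	const uint32_t *mmio_base;	/* mock; the kernel would use void __iomem * */
	unsigned int ucode_len;
	unsigned int offset;
};

/* Mock of the mailbox read so the sketch builds outside the kernel. */
static uint32_t read_mbox_dword(const uint32_t *base)
{
	return *base;
}

/* Reject a bad offset where it is established instead of WARNing later. */
static int fetch_next_offset(struct staging_state *ss)
{
	uint32_t offset = read_mbox_dword(ss->mmio_base);

	if (offset > ss->ucode_len)
		return -EINVAL;		/* placeholder errno, an assumption */

	ss->offset = offset;
	return 0;
}
```

With the check here, later code can rely on ss->offset <= ss->ucode_len
without a WARN.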

> +	ss->chunk_size = min(MBOX_XACTION_SIZE, ss->ucode_len - ss->offset);

It's a _little_ non-obvious that "can_send_next_chunk()" is also setting
->chunk_size. It would be easier to grok if it was something like:

	ok = calc_next_chunk_size(&ss);
	if (!ok)
		// error out
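
A rough sketch of what such a helper might look like, written as plain C so
it builds outside the kernel. The values of MBOX_XACTION_SIZE and
MBOX_XACTION_MAX() are assumptions here (one page, and twice the image
size), the struct is a stand-in, and the ss->state update from the quoted
hunk is left out. It also assumes ss->offset was already validated against
ss->ucode_len:

```c
#include <stdbool.h>

#define MBOX_XACTION_SIZE	4096u		/* assumed: one page per chunk */
#define MBOX_XACTION_MAX(len)	((len) * 2u)	/* assumed limit, placeholder */

/* Hypothetical stand-in for the patch's staging state. */
struct staging_state {
	unsigned int ucode_len;
	unsigned int offset;
	unsigned int chunk_size;
	unsigned int bytes_sent;
};

/*
 * Fill in ->chunk_size for the next transfer. Returns false when sending
 * that chunk would push the total past the transfer limit.
 */
static bool calc_next_chunk_size(struct staging_state *ss)
{
	unsigned int remaining = ss->ucode_len - ss->offset;

	ss->chunk_size = remaining < MBOX_XACTION_SIZE ?
			 remaining : MBOX_XACTION_SIZE;

	return ss->bytes_sent + ss->chunk_size <= MBOX_XACTION_MAX(ss->ucode_len);
}
```

The name now says that the function writes ->chunk_size, which is the part
that was easy to miss.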

> +	if (ss->bytes_sent + ss->chunk_size > MBOX_XACTION_MAX(ss->ucode_len)) {
> +		ss->state = UCODE_TIMEOUT;
> +		return false;
> +	}

"TIMEOUT" seems like an odd thing to call this failure. Can you explain
the choice of this error code a bit, please?

> +/*
> + * Handle the staging process using the mailbox MMIO interface. The
> + * microcode image is transferred in chunks until completion. Return the
> + * result state.
>   */
>  static enum ucode_state do_stage(u64 mmio_pa)
>  {
> -	pr_debug_once("Staging implementation is pending.\n");
> -	return UCODE_ERROR;
> +	struct staging_state ss = {};
> +
> +	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
> +	if (WARN_ON_ONCE(!ss.mmio_base))
> +		return UCODE_ERROR;
> +
> +	init_stage(&ss);
> +
> +	/* Perform the staging process while within the retry limit */
> +	while (!is_stage_complete(ss.offset) && can_send_next_chunk(&ss)) {
> +		/* Send a chunk of microcode each time: */
> +		if (!send_data_chunk(&ss))
> +			break;
> +		/*
> +		 * Then, ask the hardware which piece of the image it
> +		 * needs next. The same piece may be sent more than once.
> +		 */
> +		if (!fetch_next_offset(&ss))
> +			break;
> +	}

The return types here are a _bit_ wonky. The 'bool' returns make sense
for things like is_stage_complete(). But they don't look right for:

	if (!send_data_chunk(&ss))
		break;

where we'd typically use an -ERRNO and where 0 means success. It would
look something like this:

	while (!staging_is_complete(&ss)) {
		err = send_data_chunk(&ss);
		if (err)
			break;

		err = fetch_next_offset(&ss);
		if (err)
			break;
	}

That's utterly unambiguous about the intent and what types the send and
fetch function _must_ have.

Note I also moved the can_send_next_chunk() call into
staging_is_complete(). I think that makes sense as well for the
top-level loop.
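
One way the pieces could fit together, as a self-contained user-space
sketch rather than a proposal for the actual code. Everything hardware
related is mocked: "complete" is taken to mean offset == ucode_len, the
max_bytes and err fields are hypothetical, and -EIO and -EINVAL are
placeholder error codes. stage_image() is a made-up name for the body of
do_stage():

```c
#include <errno.h>
#include <stdbool.h>

#define MBOX_XACTION_SIZE	4096u	/* assumed: one page per chunk */

/* Hypothetical stand-in for the patch's staging state. */
struct staging_state {
	unsigned int ucode_len;
	unsigned int offset;
	unsigned int chunk_size;
	unsigned int bytes_sent;
	unsigned int max_bytes;	/* hypothetical transfer budget */
	int err;		/* hypothetical: why staging stopped early */
};

/* True when nothing is left to send or the budget is exhausted. */
static bool staging_is_complete(struct staging_state *ss)
{
	unsigned int remaining = ss->ucode_len - ss->offset;

	if (!remaining)
		return true;

	ss->chunk_size = remaining < MBOX_XACTION_SIZE ?
			 remaining : MBOX_XACTION_SIZE;
	if (ss->bytes_sent + ss->chunk_size > ss->max_bytes) {
		ss->err = -EIO;		/* placeholder errno */
		return true;
	}
	return false;
}

/* Mock: pretend the chunk was written to the mailbox. */
static int send_data_chunk(struct staging_state *ss)
{
	ss->bytes_sent += ss->chunk_size;
	return 0;
}

/* Mock: pretend the hardware asks for the bytes right after the chunk. */
static int fetch_next_offset(struct staging_state *ss)
{
	unsigned int next = ss->offset + ss->chunk_size;

	if (next > ss->ucode_len)
		return -EINVAL;		/* placeholder errno */
	ss->offset = next;
	return 0;
}

static int stage_image(struct staging_state *ss)
{
	int err = 0;

	while (!staging_is_complete(ss)) {
		err = send_data_chunk(ss);
		if (err)
			break;

		err = fetch_next_offset(ss);
		if (err)
			break;
	}
	return err ? err : ss->err;
}
```

The loop body only ever sees 0 or a negative errno, and the budget check
lives in one place.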