[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d8d3959a-8c38-4ce4-885d-6a1e40219831@intel.com>
Date: Wed, 6 Nov 2024 10:28:25 -0800
From: "Chang S. Bae" <chang.seok.bae@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: <linux-kernel@...r.kernel.org>, <x86@...nel.org>, <tglx@...utronix.de>,
<mingo@...hat.com>, <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH RFC 4/7] x86/microcode/intel: Prepare for microcode
staging
On 11/4/2024 3:16 AM, Borislav Petkov wrote:
> On Tue, Oct 01, 2024 at 09:10:39AM -0700, Chang S. Bae wrote:
>> +
>> +static bool need_staging(u64 *mmio_addrs, u64 pa)
>> +{
>> + unsigned int i;
>> +
>> + for (i = 0; mmio_addrs[i] != 0; i++) {
>> + if (mmio_addrs[i] == pa)
>> + return false;
>> + }
>> + mmio_addrs[i] = pa;
>
> This is a weird function - its name is supposed to mean it queries something
> but then it has side effects too.
I think I've carved out a bit more out of the loop and convoluted them
into a single function.
Instead, let the helper simply find the position in the array,
static unsigned int find_pos(u64 *mmio_addrs, u64 mmio_pa)
{
unsigned int i;
for (i = 0; mmio_addrs[i] != 0; i++) {
if (mmio_addrs[i] == mmio_pa)
break;
}
return i;
}
and update the array from there:
static void stage_microcode(void)
{
unsigned int pos;
...
for_each_cpu(cpu, cpu_online_mask) {
/* set 'mmio_pa' from RDMSR */
pos = find_pos(mmio_addrs, mmio_pa);
if (mmio_addrs[pos] == mmio_pa)
continue;
/* do staging */
mmio_addrs[pos] = mmio_pa;
}
...
}
Or, even the loop code can include it:
for_each_cpu(...) {
/* set 'mmio_pa' from RDMSR */
/* Find either the last position or a match */
for (i = 0; mmio_addrs[i] != 0 && mmio_addrs[i] !=
mmio_pa; i++);
if (mmio_addrs[i] == mmio_pa)
continue;
/* do staging */
mmio_addrs[i] = mmio_pa;
}
Maybe, if this unusual one line for-loop is not an issue, this can make
the code simple. Otherwise, at least the helper should remain doing one
thing.
>> + for_each_cpu(cpu, cpu_online_mask) {
>
> Oh great, and someone went and offlined one of those CPUs right here. Fun.
Yes, offlining matters. Let me summarize a few things related to the
patchset for the record:
* The staging flow is tiggered by load_late_locked(), which already
holds cpus_read_lock() [1].
* When any SMT primary threads is offlined, setup_cpus() makes it
exit right away [2].
* When an offlined CPU is brought up online, it follows the early
loading path, which does not include staging.
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/microcode/core.c#n705
[2]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/microcode/core.c#n658
>
>> + mmio_pa = staging_addr(cpu);
>> +
>> + if (need_staging(mmio_addrs, mmio_pa) &&
>> + !staging_work(mmio_pa, ucode_patch_late, totalsize)) {
>
> do_stage()
>
>> + pr_err("Error: staging failed.\n");
>
> ... on CPU%d, err_val: 0x%x"\n"
>
> perhaps?
>
> For more info debugging something like that?
Maybe, either one of two ways:
The work function remains returning bool value, then it can print the
error message from there.
bool do_stage(...)
{
...
pr_err("Error: staging failed on CPU%d, with %s.\n",
smp_processor_id(), state == UCODE_TIMEOUT ?
"timeout" : "error");
return false
}
Then, this function may simply return on error.
static void stage_microcode(void)
{
...
if (!do_stage(...))
goto out;
...
out:
kfree(mmio_addrs);
}
Or, let the work function return a value like an enum ucode_state
indicating the status, and print the error message here.
static void stage_microcode(void)
{
enum ucode_state state;
...
state = do_stage(...);
if (state != UCODE_OK)
goto err_out;
err_out:
pr_err("Error: ...);
out:
...
}
>
>> + goto out;
>> + }
>> + }
>> +
>> + pr_info("Staging succeeded.\n");
>
> "Staging of patch revision 0x%x succeeded.\n"...
>
> more user-friendly.
Yes, that seems to be useful to put it on:
+ pr_info("Staging of patch revision 0x%x succeeded.\n",
+ ((struct microcode_header_intel *)ucode_patch_late)->rev);
If successful, users will see something like this:
[ 25.352716] microcode: Staging is available.
[ 25.357684] microcode: Current revision: 0x1234
[ 136.203269] microcode: Staging of patch revision 0xabcd succeeded.
[ 137.398491] microcode: load: updated on xx primary CPUs with yy siblings
[ 137.427386] microcode: revision: 0x1234 -> 0xabcd
Thanks,
Chang
Powered by blists - more mailing lists