lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d8d3959a-8c38-4ce4-885d-6a1e40219831@intel.com>
Date: Wed, 6 Nov 2024 10:28:25 -0800
From: "Chang S. Bae" <chang.seok.bae@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: <linux-kernel@...r.kernel.org>, <x86@...nel.org>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH RFC 4/7] x86/microcode/intel: Prepare for microcode
 staging

On 11/4/2024 3:16 AM, Borislav Petkov wrote:
> On Tue, Oct 01, 2024 at 09:10:39AM -0700, Chang S. Bae wrote:
>> +
>> +static bool need_staging(u64 *mmio_addrs, u64 pa)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; mmio_addrs[i] != 0; i++) {
>> +		if (mmio_addrs[i] == pa)
>> +			return false;
>> +	}
>> +	mmio_addrs[i] = pa;
> 
> This is a weird function - its name is supposed to mean it queries something
> but then it has side effects too.

I think I've carved out a bit more out of the loop and convoluted them
into a single function.

Instead, let the helper simply find the position in the array,

         static unsigned int find_pos(u64 *mmio_addrs, u64 mmio_pa)
         {
                 unsigned int i;

                 for (i = 0; mmio_addrs[i] != 0; i++) {
                         if (mmio_addrs[i] == mmio_pa)
                                 break;
                 }
                 return i;
         }

and update the array from there:

         static void stage_microcode(void)
         {
                 unsigned int pos;
                 ...
                 for_each_cpu(cpu, cpu_online_mask) {
                         /* set 'mmio_pa' from RDMSR */

                         pos = find_pos(mmio_addrs, mmio_pa);
                         if (mmio_addrs[pos] == mmio_pa)
                                 continue;

                         /* do staging */

                         mmio_addrs[pos] = mmio_pa;
                 }
                 ...
         }

Or, even the loop code can include it:

         for_each_cpu(...) {
                 /* set 'mmio_pa' from RDMSR */

                 /* Find either the last position or a match */
                 for (i = 0; mmio_addrs[i] != 0 && mmio_addrs[i] != 
mmio_pa; i++);

                 if (mmio_addrs[i] == mmio_pa)
                         continue;

                 /* do staging */

                 mmio_addrs[i] = mmio_pa;
         }

Maybe, if this unusual one line for-loop is not an issue, this can make
the code simple. Otherwise, at least the helper should remain doing one
thing.

    >> +	for_each_cpu(cpu, cpu_online_mask) {
> 
> Oh great, and someone went and offlined one of those CPUs right here. Fun.

Yes, offlining matters. Let me summarize a few things related to the
patchset for the record:

   *  The staging flow is tiggered by load_late_locked(), which already
      holds cpus_read_lock() [1].

   *  When any SMT primary threads is offlined, setup_cpus() makes it
      exit right away [2].

   *  When an offlined CPU is brought up online, it follows the early
      loading path, which does not include staging.

[1]: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/microcode/core.c#n705
[2]: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/microcode/core.c#n658

> 
>> +		mmio_pa = staging_addr(cpu);
>> +
>> +		if (need_staging(mmio_addrs, mmio_pa) &&
>> +		    !staging_work(mmio_pa, ucode_patch_late, totalsize)) {
> 
> do_stage()
> 
>> +			pr_err("Error: staging failed.\n");
> 
> 							... on CPU%d, err_val: 0x%x"\n"
> 
> perhaps?
> 
> For more info debugging something like that?

Maybe, either one of two ways:

The work function remains returning bool value, then it can print the
error message from there.

         bool do_stage(...)
         {
                 ...
                 pr_err("Error: staging failed on CPU%d, with %s.\n",
                         smp_processor_id(), state == UCODE_TIMEOUT ? 
"timeout" : "error");
                 return false
         }

Then, this function may simply return on error.

         static void stage_microcode(void)
         {
                 ...
                 if (!do_stage(...))
                         goto out;
                 ...
         out:
                 kfree(mmio_addrs);
         }

Or, let the work function return a value like an enum ucode_state
indicating the status, and print the error message here.

         static void stage_microcode(void)
         {
                 enum ucode_state state;

                 ...
                 state = do_stage(...);
                 if (state != UCODE_OK)
                         goto err_out;

         err_out:
                 pr_err("Error: ...);
         out:
                 ...
         }

> 
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	pr_info("Staging succeeded.\n");
> 
> 		"Staging of patch revision 0x%x succeeded.\n"...
> 
> more user-friendly.

Yes, that seems to be useful to put it on:

+       pr_info("Staging of patch revision 0x%x succeeded.\n",
+               ((struct microcode_header_intel *)ucode_patch_late)->rev);

If successful, users will see something like this:

[   25.352716] microcode: Staging is available.
[   25.357684] microcode: Current revision: 0x1234
[  136.203269] microcode: Staging of patch revision 0xabcd succeeded.
[  137.398491] microcode: load: updated on xx primary CPUs with yy siblings
[  137.427386] microcode: revision: 0x1234 -> 0xabcd

Thanks,
Chang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ