linux-kernel - Re: [PATCH] x86 microcode: work_on_cpu and cleanup of the synchronization logic

Message-ID: <b647ffbd0904240711v6e6930f5ia438bdcd69a015a4@mail.gmail.com>
Date:	Fri, 24 Apr 2009 16:11:10 +0200
From:	Dmitry Adamushko <dmitry.adamushko@...il.com>
To:	Hugh Dickins <hugh@...itas.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Andreas Herrmann <andreas.herrmann3@....com>,
	Peter Oruba <peter.oruba@....com>,
	Arjan van de Ven <arjan@...radead.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86 microcode: work_on_cpu and cleanup of the 
	synchronization logic

2009/4/24 Hugh Dickins <hugh@...itas.com>:
> On Fri, 24 Apr 2009, Dmitry Adamushko wrote:
>> 2009/4/23 Hugh Dickins <hugh@...itas.com>:
>> >
>> > I guess your mutex synchronization works out, but are interrupts
>> > still disabled around the critical wrmsr()s, wherever they're getting
>> > called from?
>>
>> Yes, *msr() calls are only done from functions that are now being
>> called via smp_call_function_single(). The latter seems to always run
>> them with interrupts disabled. The only exception is mc_sysdev_resume()
>> calling ->apply_microcode() directly, but this one in turn is always
>> called with interrupts disabled.
>>
>> But now that you mentioned it I wonder if we may actually need a
>> spinlock there... can we have multi-threaded cpus/cores with (all |
>> some) shared msr registers?
>
> Good thinking, yes we can and do, unless I'm misinterpreting the
> evidence.  Though P4 Xeon and Atom startup messages give the opposite
> impression, claiming to update all cpus from lower revision, more
> careful tests starting from "maxcpus=1" and then "echo 1 >online"
> (which, unless you've fiddled around putting the microcode_ctl'ed
> microcode.dat into /lib/firmware/intel-ucode/wherever, isn't able
> to update at online time on Intel) show that the later onlined
> siblings already have the updated microcode applied to their
> previously onlined siblings.  Which isn't surprising, but I'd
> been lulled into thinking the opposite by the startup sequence.

Ah, stupid me :-/ These differences in behavior during the startup and
the later update reveal a real bug in my patch.

This part:

mutex_lock(&microcode_mutex);
error = sysdev_driver_register(&cpu_sysdev_class, &mc_sysdev_driver);
mutex_unlock(&microcode_mutex);

sysdev_driver_register() calls mc_sysdev_driver's ->add() (which is
mc_sysdev_add()) for each cpu in a loop. So "microcode_mutex" can't
serialize these calls, oops. A very obvious thing, but I missed it.

>
> Please add "HT versus not" to my earlier list of confusions.
>
> microcode_mutex still covers most cases: is it the case of onlining
> two threads at the same time that slips through?  Is that permitted
> at the outer level?

If the threads are onlined with cpu_up() then it should be ok - no
concurrent cpu_up()s are allowed. I'll check it out.

> Though even if it is, I'd feel safest to have
> the spin_lock_irqsaves back (or if not, comment to say why not needed).

I'll verify how HT threads are initialized (I'd imagine that it's
indeed via cpu_up(), at the very least for the sake of consistency, as
they pretend to be 'normal' cpus to upper layers, e.g. they can be
offlined/onlined).

I'm also wondering whether the synchronization with "microcode_mutex"
is way too strong/restrictive in this case. Perhaps we can actually add
some parallelism here (with spinlocks in arch-specific parts only where
necessary).
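
To illustrate the kind of arch-specific locking meant here, a minimal
userspace sketch follows. It is not kernel code: struct core,
core_lock(), core_unlock() and apply_update() are hypothetical names, a
C11 atomic_flag stands in for a kernel spinlock, the "rev" field stands
in for the MSR state shared by sibling threads, and interrupt disabling
is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one lock and one revision per physical core. */
struct core {
	atomic_flag lock;	/* spinlock guarding the shared state */
	unsigned int rev;	/* microcode revision seen by all siblings */
};

static void core_lock(struct core *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;
}

static void core_unlock(struct core *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Returns true if this call performed the update, false if the core
 * already runs new_rev or newer (e.g. a sibling thread got there first).
 */
static bool apply_update(struct core *c, unsigned int new_rev)
{
	bool applied = false;

	assert(c != NULL);
	core_lock(c);
	if (c->rev < new_rev) {
		c->rev = new_rev;	/* stands in for the wrmsr() sequence */
		applied = true;
	}
	core_unlock(c);
	return applied;
}
```

With two sibling threads pointing at the same struct core, the first
apply_update() call does the write and the second one finds the revision
already current, which is the behavior Hugh observed for later-onlined
siblings. Only siblings of one core contend on the lock, so different
cores could be updated in parallel.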

On the other hand, I think that we can optimize cases when a few cpus
are being updated one after another (upon modprobe microcode or
writing into /dev/microcode).

Assumption: most of the CPUs (maybe with the exception of the boot cpu
when its ucode is updated by the BIOS) upgrade from revision A to B,
where A and B are the same for all of them (well, at least B -- the
most recent one [*]).

Then why bother loading/traversing firmware (or traversing .dat files)
for each of them?

[*] btw., are all CPUs on SMP systems similar wrt model, stepping?

Even if not, we could do some caching so that if cpu-2 asks for
intel-ucode/06-0f-0a and we know that cpu-1 has just done the same and
still has a proper ucode in its buffer, then we just make a copy.
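
A rough userspace sketch of such caching, just to make the idea
concrete. Everything here is hypothetical: struct ucode_cache,
ucode_fw_name(), slow_load() and ucode_get() are made-up names,
slow_load() only fabricates a dummy image in place of a real firmware
request, and the file name is assumed to be family-model-stepping in
hex, as in the path above. No locking is shown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-entry cache: remembers the last image loaded. */
struct ucode_cache {
	char name[64];		/* e.g. "intel-ucode/06-0f-0a" */
	unsigned char *data;	/* cached image, owned by the cache */
	size_t size;
	unsigned long loads;	/* how often the slow path was taken */
};

/* Build the firmware name from family/model/stepping (assumed format). */
static void ucode_fw_name(char *buf, size_t len, unsigned int family,
			  unsigned int model, unsigned int stepping)
{
	snprintf(buf, len, "intel-ucode/%02x-%02x-%02x",
		 family, model, stepping);
}

/* Stand-in for the expensive load; returns a dummy 16-byte image. */
static unsigned char *slow_load(const char *name, size_t *size)
{
	unsigned char *img = malloc(16);

	if (img) {
		memset(img, (unsigned char)name[0], 16);
		*size = 16;
	}
	return img;
}

/* Return a private copy of the image for 'name'; the caller frees it. */
static unsigned char *ucode_get(struct ucode_cache *c, const char *name,
				size_t *size)
{
	unsigned char *copy;

	assert(c && name && size);
	if (!c->data || strcmp(c->name, name) != 0) {
		size_t new_size = 0;
		unsigned char *img = slow_load(name, &new_size);

		if (!img)
			return NULL;
		free(c->data);		/* drop the previously cached image */
		c->data = img;
		c->size = new_size;
		snprintf(c->name, sizeof(c->name), "%s", name);
		c->loads++;
	}
	copy = malloc(c->size);
	if (copy) {
		memcpy(copy, c->data, c->size);
		*size = c->size;
	}
	return copy;
}
```

If cpu-1 and cpu-2 both call ucode_get() with the same name, only the
first call takes the slow path and the second just gets a copy. A
single slot is enough under the assumption above that nearly all cpus
ask for the same file one after another.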

>
> Hugh
>

-- 
Best regards,
Dmitry Adamushko