[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b647ffbd0904240711v6e6930f5ia438bdcd69a015a4@mail.gmail.com>
Date: Fri, 24 Apr 2009 16:11:10 +0200
From: Dmitry Adamushko <dmitry.adamushko@...il.com>
To: Hugh Dickins <hugh@...itas.com>
Cc: Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Rusty Russell <rusty@...tcorp.com.au>,
Andreas Herrmann <andreas.herrmann3@....com>,
Peter Oruba <peter.oruba@....com>,
Arjan van de Ven <arjan@...radead.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86 microcode: work_on_cpu and cleanup of the
synchronization logic
2009/4/24 Hugh Dickins <hugh@...itas.com>:
> On Fri, 24 Apr 2009, Dmitry Adamushko wrote:
>> 2009/4/23 Hugh Dickins <hugh@...itas.com>:
>> >
>> > I guess your mutex Synchronization works out, but are interrupts
>> > still disabled around the critical wrmsr()s, wherever they're getting
>> > called from?
>>
>> Yes, *msr() calls are only done from functions that are now being
>> called via smp_call_function_single(). The later seems to always do it
>> with disabled interrupts. The only exception is mc_sysdev_resume()
>> calling ->apply_microcode() directly but this one in turn is always
>> called with disabled interrupts.
>>
>> But now that you mentioned it I wonder if we may actually need a
>> spinlock there... can we have multi-threaded cpus/cores with (all |
>> some) shared msr registers?
>
> Good thinking, yes we can and do, unless I'm misinterpreting the
> evidence. Though P4 Xeon and Atom startup messages give the opposite
> impression, claiming to update all cpus from lower revision, more
> careful tests starting from "maxcpus=1" and then "echo 1 >online"
> (which, unless you've fiddled around putting the microcode_ctl'ed
> microcode.dat into /lib/firmware/intel-ucode/wherever, isn't able
> to update at online time on Intel) shows that the later onlined
> siblings already have the updated microcode applied to their
> previously onlined siblings. Which isn't surprising, but I'd
> been lulled into thinking the opposite by the startup sequence.
Ah, stupid me :-/ These differences in behavior during the startup and
the later update reveal a real bug in my patch.
this part:
mutex_lock(µcode_mutex);
error = sysdev_driver_register(&cpu_sysdev_class, &mc_sysdev_driver);
mutex_unlock(µcode_mutex);
sysdev_driver_register() calls mc_sysdev_driver's ->add() (which is
mc_sysdev_add()) for each cpu in a loop. Obviously, "microcode_mutex"
can't help to serialize these calls, oops. A very obvious thing but I
missed it.
>
> Please add "HT versus not" to my earlier list of confusions.
>
> microcode_mutex still covers most cases: is it the case of onlining
> two threads at the same time that slips through? Is that permitted
> at the outer level?
If the threads are onlined with cpu_up() then it should be ok - no
concurrent cpu_up()s are allowed. I'll check it out.
> Though even if it is, I'd feel safest to have
> the spin_lock_irqsaves back (or if not, comment to say why not needed).
I'll verify regarding the initialization of HT threads (I'd imagine
that it's indeed via cpu_up(), at the very least for the sake of
consistency as they pretend to be 'normal' cpus to upper layers, e.g.
can be offline/online-ed).
I'm also thinking if the synchronization with "microcode_mutex" is way
too strong/restrictive in this case. Perhaps we actually can add some
parallelism here (with spinlocks in arch-specific parts only where
necessary).
On the other hand, I think that we can optimize cases when a few cpus
are being updated one after another (upon modprobe microcode or
writing into /dev/microcode).
Assumption: most of the CPUs (maybe with an exception of the boot-cpu
when its ucode is updated by BIOS) upgrade from revisions A to B,
where A and B are the same for all of them (well, at least B -- the
most recent one [*]).
Then why bother loading/traversing firmware (or traversing .dat files)
for each of them?
[*] btw., are all CPUs on SMP systems similar wrt model, stepping?
Even if not, we could do some caching so that if cpu-2 asks for
intel-ucode/06-0f-0a and we know that cpu-1 has just done the same and
still has a proper ucode in its buffer, then we just make a copy.
>
> Hugh
>
--
Best regards,
Dmitry Adamushko
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists