Message-ID: <20190906161735.GH19008@zn.tnic>
Date:   Fri, 6 Sep 2019 18:17:35 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Johannes Erdfelt <johannes@...felt.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Mihai Carabas <mihai.carabas@...cle.com>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Jon Grimm <Jon.Grimm@....com>, kanth.ghatraju@...cle.com,
        konrad.wilk@...cle.com, patrick.colp@...cle.com,
        Tom Lendacky <thomas.lendacky@....com>,
        x86-ml <x86@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/microcode: Add an option to reload microcode even if
 revision is unchanged

On Fri, Sep 06, 2019 at 08:46:18AM -0700, Johannes Erdfelt wrote:
> That said, we very much rely on late microcode loading and it has helped
> us and our customers significantly.

You do realize that you rely on an update method which *won't* work in
all possible cases, and that you *will* have to reboot if the microcode
patching *must* happen early, don't you?

> It's really easy to say "fix your infrastructure" when you're not
> running that infrastructure.

I'm not saying you should fix your infrastructure now - I'm saying you
should keep that in mind when thinking whether to rely more on late
loading or not. Who knows, maybe newer generation machines in the fleet
could do load balancing, live migration, whatever fancy new cloud stuff
it is, to facilitate a proper reboot.

Or someone could rewrite arch/x86/ to rediscover new features upon a
microcode reload or a feature disabling. And do that in a clean way. Who
knows...

> Reboots suck. Customers hate it. Operations hates it. When you get into
> the number of hosts we have, you run into all kinds of weird failure
> scenarios. (What do you mean that the NIC that was working just fine
> before the reboot is no longer seen on the PCI bus?)

Yeah, I've heard all the stories.

> The more reboots we can avoid, the better it is for us and our
> customers.

So how do you update the kernels on those machines? Or do you live-patch
in the new functionality too?

> I understand that it could be unsafe to late load some rare microcode
> updates (theoretical or not). However, that is certainly the exception.
> We have done this multiple times on our fleet and we plan to continue
> doing so in the future.

The fact that it has worked for you does not make it right. It won't
magically become safe, as tglx said.

But since you do custom development, you should be fine, it seems.

Practically speaking, late loading probably won't disappear, since it is
apparently being used. Just don't expect it to get "extended" if that
extension brings fallout and duct-tape fixes left and right.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
