Date:   Tue, 13 Dec 2016 17:34:11 +0100
From:   Roland Scheidegger <rscheidegger_lists@...peed.ch>
To:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     x86@...nel.org, Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Bruce Schlobohm <bruce.schlobohm@...el.com>,
        Kevin Stanton <kevin.b.stanton@...el.com>,
        Allen Hung <allen_hung@...l.com>
Subject: Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent
 TSC deadline timer irq storm

Am 13.12.2016 um 14:14 schrieb Thomas Gleixner:
> Roland reported interesting TSC ADJUST register wreckage on his DELL
> machine, which seems to populate that MSR with a random number generator.

FWIW, I thought about the actual values some more and I no longer think
they are all that random: the behavior is consistent with the BIOS
trying to zero the TSC of all CPUs. If I understand this right, writing
a zero to TSC would cause somewhat small negative values in the TSC_ADJ
register at boot time, and larger negative values after suspend/resume
(at least if the TSC just stops when suspended and isn't reset) -
exactly what I'm seeing.
(And of course the different TSC_ADJ values would be because the BIOS is
writing TSC without any thought of synchronization, just one CPU after
another.)
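
To make that arithmetic concrete, here is a toy model in Python. It relies
on one documented property: a write to the TSC changes TSC_ADJ by the same
delta as the TSC itself. Everything else is an assumption for illustration:
the Cpu class, the firmware_zero_tsc() and simulate() helpers, the cycle
counts, and the idea that TSC_ADJ starts at 0 before the firmware runs.
It is only a sketch of the hypothesis, not a description of what this
BIOS really does.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    """Toy model of one CPU's TSC and TSC_ADJ (hypothetical, not kernel code)."""
    tsc: int = 0
    tsc_adjust: int = 0  # assumed to be 0 before the firmware touches the TSC

    def tick(self, cycles: int) -> None:
        # Free-running counter: only the TSC advances, TSC_ADJ is unaffected.
        self.tsc += cycles

    def write_tsc(self, value: int) -> None:
        # A write to the TSC applies the same delta to TSC_ADJ.
        delta = value - self.tsc
        self.tsc = value
        self.tsc_adjust += delta


def firmware_zero_tsc(cpus, cycles_between_writes):
    """Zero the TSC on each CPU in turn, with no synchronization."""
    for cpu in cpus:
        cpu.write_tsc(0)
        # Time passes on every CPU before the next one is handled.
        for other in cpus:
            other.tick(cycles_between_writes)
    return [cpu.tsc_adjust for cpu in cpus]


def simulate(tsc_before_firmware, n_cpus=2, gap=50):
    """Return the per-CPU TSC_ADJ values after the unsynchronized zeroing."""
    cpus = [Cpu(tsc=tsc_before_firmware) for _ in range(n_cpus)]
    return firmware_zero_tsc(cpus, gap)


# Made-up cycle counts: a small TSC shortly after power-on, a large one
# after resume if the TSC kept its pre-suspend value.
boot = simulate(1_000)
resume = simulate(5_000_000)
```

With these made-up numbers, boot is [-1000, -1050] and resume is
[-5000000, -5000050]: small negative values at boot, much larger ones
after resume, and a slightly different value on each CPU because the
writes happen one after another.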

> 
> Deeper investigation into fixing this wreckage unearthed another special
> feature which is designed by Intel: Negative TSC adjust values cause
> interrupt storms on the TSC deadline timer. Further details in patch 2/2

This actually looks like quite a serious hardware bug to me. Shouldn't
there be an erratum for such a bug?

And I still don't quite understand why the lockup doesn't happen after a
warm boot; there must be something different there...

(I didn't have the chance to test the patch yet.)

Roland

