Message-ID: <33d4286c-3f77-1274-34b7-bc62d2c146a4@hispeed.ch>
Date: Tue, 13 Dec 2016 17:34:11 +0100
From: Roland Scheidegger <rscheidegger_lists@...peed.ch>
To: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>
Cc: x86@...nel.org, Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
Bruce Schlobohm <bruce.schlobohm@...el.com>,
Kevin Stanton <kevin.b.stanton@...el.com>,
Allen Hung <allen_hung@...l.com>
Subject: Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent
TSC deadline timer irq storm
On 13.12.2016 at 14:14, Thomas Gleixner wrote:
> Roland reported interesting TSC ADJUST register wreckage on his DELL
> machine, which seems to populate that MSR with a random number generator.
FWIW, I thought about the actual values some more and I no longer think
they are all that random: the behavior is consistent with the BIOS
trying to zero the TSC of all CPUs. If I understand this right, writing
a zero to TSC would cause somewhat small negative values in the TSC_ADJ
register at boot time, and larger negative values at suspend time (at
least if the TSC just stops when suspended and isn't reset), which is
exactly what I'm seeing.
(And of course the different TSC_ADJ values would be because the BIOS is
writing TSC without any thought of synchronization, just one CPU after
another.)
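To illustrate the reasoning, here is a toy model, not kernel code. It assumes (as I understand the documented behavior) that a write to TSC adds the difference between the new and old value to TSC_ADJ as well. The names CpuModel and bios_zero_all and all cycle counts are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # both registers are modeled as 64 bits wide


class CpuModel:
    """Hypothetical model of one CPU's TSC / TSC_ADJ register pair."""

    def __init__(self):
        self.tsc = 0
        self.tsc_adjust = 0

    def tick(self, cycles):
        # Time passing advances the TSC only, never TSC_ADJ.
        self.tsc = (self.tsc + cycles) & MASK64

    def write_tsc(self, value):
        # Assumption: writing the TSC adds (new - old) to TSC_ADJ too.
        delta = (value - self.tsc) & MASK64
        self.tsc_adjust = (self.tsc_adjust + delta) & MASK64
        self.tsc = value & MASK64

    def adjust_signed(self):
        # Interpret the 64-bit register as a signed value.
        v = self.tsc_adjust
        return v - (1 << 64) if v >> 63 else v


def bios_zero_all(cpus, gap):
    """Zero each CPU's TSC in turn, with `gap` cycles between the writes."""
    for cpu in cpus:
        cpu.write_tsc(0)
        for other in cpus:
            other.tick(gap)


if __name__ == "__main__":
    cpus = [CpuModel(), CpuModel()]
    for cpu in cpus:
        cpu.tick(1000)           # a short time after reset
    bios_zero_all(cpus, gap=500)
    print([cpu.adjust_signed() for cpu in cpus])
```

In this model each CPU ends up with a negative TSC_ADJ equal to the TSC value it had when it was zeroed, so unsynchronized writes give a different value per CPU, and a TSC that kept its large pre-suspend value gives a correspondingly larger negative number.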
>
> Deeper investigation into fixing this wreckage unearthed another special
> feature which is designed by Intel: Negative TSC adjust values cause
> interrupt storms on the TSC deadline timer. Further details in patch 2/2
This actually looks like quite a serious hw bug to me. Shouldn't there
be an erratum for such a bug?
And I still don't quite understand why the lockup doesn't happen after a
warm boot, there must be something different there...
(I didn't have the chance to test the patch yet.)
Roland