Date:   Fri, 16 Dec 2016 12:46:12 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     LKML <linux-kernel@...r.kernel.org>
cc:     x86@...nel.org, Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Bruce Schlobohm <bruce.schlobohm@...el.com>,
        Roland Scheidegger <rscheidegger_lists@...peed.ch>,
        Kevin Stanton <kevin.b.stanton@...el.com>,
        Allen Hung <allen_hung@...l.com>, stable@...r.kernel.org
Subject: Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >=
 zero

On Tue, 13 Dec 2016, Thomas Gleixner wrote:
> Roland reported that his DELL T5810 sports a value add BIOS which
> completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
> random negative TSC_ADJUST values, different on all CPUs. That renders the
> TSC useless because the synchronization check fails.

While everyone assumed that this is the usual DELL squirmware problem, I
have to say it's not.

I just got my hands on a Skylake-based Lenovo S510 box and it shows the same
feature:

TSC ADJUST: CPU0: -10123656703215
            CPU1: -10123656796701
            CPU2: -10123656797460
            CPU3: -10123656798366

This causes the TSC to be out of sync on a stock upstream kernel, and the
TSC deadline timer wreckage happens on that machine as well.
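
To put numbers on the skew, here is a minimal sketch that takes the four
values reported above and computes each CPU's offset relative to CPU0. The
helper names are made up for illustration, and the consistency check assumes
the simple model that the per-CPU values must all be equal for the TSCs to
agree; it is not the kernel's actual synchronization check.

```python
# TSC_ADJUST values copied from the report above (per CPU, in TSC ticks).
TSC_ADJUST = {
    0: -10123656703215,
    1: -10123656796701,
    2: -10123656797460,
    3: -10123656798366,
}


def skew_vs_reference(values, ref_cpu=0):
    """Return each CPU's TSC_ADJUST offset relative to ref_cpu (hypothetical helper)."""
    ref = values[ref_cpu]
    return {cpu: adj - ref for cpu, adj in values.items()}


def is_consistent(values):
    """Assumed model: the values are consistent only if every CPU holds the same one."""
    return len(set(values.values())) == 1


skew = skew_vs_reference(TSC_ADJUST)
for cpu in sorted(skew):
    print(f"CPU{cpu}: {skew[cpu]:+d} ticks vs CPU0")
print("consistent:", is_consistent(TSC_ADJUST))
```

Under that model CPU1 to CPU3 each sit roughly 93000 to 95000 ticks behind
CPU0, which is far more than a synchronization check can be expected to
tolerate.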

I'm pretty sure that this well-thought-out feature to 'hide power on time'
from TSC has not been independently 'invented' by DELL and Lenovo BIOS
tinkerers.

I rather have the impression that this is an advisory or feature kit from
some other entity. Whoever came up with this misfeature at Intel and/or
Microsoft (sorry, I could not come up with any other suspects) should be
promoted to run the 'Linux on feature-plagued systems' hot line.

As this seems to be more widespread than we thought initially, we have to
think about a solution for stable kernels, especially 4.9. And distros will
have to think about that as well.

We have two options:

1) Disable TSC deadline timer by default and force users with sane machines
   to enable it on the kernel command line.

   Upside:   Very small patch
   
   Downside: Degrades existing setups on sane machines, keeps TSC unusable
             on affected machines. We have no idea what other hidden side
             effects the TSC_ADJUST tinkering has. If there are any, they
             won't be nice ones.

2) Push the whole TSC_ADJUST sanitizing machinery into stable

   Upside:   Does not affect sane machines and gives a benefit to users of
             affected machines

   Downside: Rather large patch, but not that risky either. Needs a few
             eyes and good test coverage though
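
As a rough illustration of what option 2 aims for, the sketch below models
only the rule named in the subject line: force TSC_ADJUST to a value >= zero.
It is a deliberately simplified assumption, not the sanitizing machinery
itself, and the function name is hypothetical. The input values are the ones
from the Lenovo S510 above.

```python
# Per-CPU TSC_ADJUST values as reported for the Lenovo S510 in this mail.
REPORTED = {
    0: -10123656703215,
    1: -10123656796701,
    2: -10123656797460,
    3: -10123656798366,
}


def force_non_negative(values):
    """Simplified model: clamp negative TSC_ADJUST values to zero.

    Non-negative values are left untouched, so sane machines are unaffected.
    """
    return {cpu: max(adj, 0) for cpu, adj in values.items()}


sanitized = force_non_negative(REPORTED)
for cpu in sorted(sanitized):
    print(f"CPU{cpu}: {REPORTED[cpu]} -> {sanitized[cpu]}")
```

In this model every CPU of the affected machine ends up with the same value,
zero, while a machine that boots with TSC_ADJUST already at zero sees no
change at all.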

Thoughts?

	tglx
