Message-ID: <alpine.DEB.2.20.1612142225130.5283@nanos>
Date:   Wed, 14 Dec 2016 22:40:24 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Roland Scheidegger <rscheidegger_lists@...peed.ch>
cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Bruce Schlobohm <bruce.schlobohm@...el.com>,
        Kevin Stanton <kevin.b.stanton@...el.com>,
        Allen Hung <allen_hung@...l.com>
Subject: Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent
 TSC deadline timer irq storm

On Wed, 14 Dec 2016, Thomas Gleixner wrote:
> Positive space, results in timer not firing anymore - at least not in a
> time frame you are willing to wait for.
> 
>      0x0000 0000 8000 0000
> 
> Negative space, results in an interrupt storm.
> 
>      0xffff ffff 0000 0000
>      0xffff fffe 0000 0000
>      0xffff fffd 0000 0000
>      0xffff fffc 0000 0000
>      0xffff fffb 0000 0000
>      ....
> 
> These points are independent of the underlying counter value (cold boot,
> warm boot) and even reproduce after hours of power on reliably.
> 
> And looking at the values makes me wonder about 32bit vs. 64bit wreckage
> combined with sign expansion done wrong. I'm really impressed!

And the whole mess stems from the fact that the deadline is not, as one
would expect, simply compared against the sum of the counter and the adjust
MSR.

No, they subtract the adjust value from the deadline value when you write
the deadline MSR and latch the result to compare it against the counter.

So the following happens:

   ADJUST	= 0
   RDTSC	= 10000000 
   DEADLINE	= 11000000

   ADJUST	=  1000000

   INTERRUPT
   RDTSC	= 12000000

   DEADLINE	= 13000000

   ADJUST	=        0

   INTERRUPT
   RDTSC	= 12000000

So depending on the direction of the adjustment the timer fires late or
early.
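
The arithmetic of the sequence above can be replayed with a small model.
This is only a sketch of the behaviour as described in this mail, not of
the real hardware: fire_time() and its parameter names are made up for
illustration, and it assumes that RDTSC returns the internal counter plus
the current adjust value.

```python
def fire_time(deadline, adjust_at_write, adjust_at_fire):
    """Visible TSC value at which the interrupt fires in this toy model.

    Assumption: the hardware latches (deadline - adjust) when the deadline
    is written and later compares that latched value against the raw
    internal counter, while RDTSC returns counter + adjust.
    """
    latched = deadline - adjust_at_write
    # The interrupt fires once the raw counter reaches the latched value;
    # what software reads via RDTSC at that point is counter + adjust.
    return latched + adjust_at_fire


# First case: deadline written with ADJUST = 0, ADJUST then raised by 1000000.
late = fire_time(11000000, adjust_at_write=0, adjust_at_fire=1000000)

# Second case: deadline written with ADJUST = 1000000, ADJUST then set back to 0.
early = fire_time(13000000, adjust_at_write=1000000, adjust_at_fire=0)

print(late, early)
```

In this model both cases fire at a visible TSC of 12000000: late by 1000000
for the first deadline, early by 1000000 for the second.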

Combined with that math wreckage this is a complete disaster. And of course
nothing is documented anywhere and the SDM is outright wrong:

10.5.4.1 TSC-Deadline Mode

  The processor generates a timer interrupt when the value of time-stamp
  counter is greater than or equal to that of IA32_TSC_DEADLINE. It then
  disarms the timer and clear the IA32_TSC_DEADLINE MSR. (Both the time-stamp
  counter and the IA32_TSC_DEADLINE MSR are 64-bit unsigned integers.)

See the example above. 12000000 is neither equal to nor greater than
13000000, at least not in my universe.

I seriously doubt that Intel will manage to design at least ONE functional,
non-broken timer before I retire.

Thanks,

	tglx
