Message-ID: <alpine.DEB.2.21.1907251127430.1791@nanos.tec.linutronix.de>
Date:   Thu, 25 Jul 2019 11:37:46 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Rui Salvaterra <rsalvaterra@...il.com>
cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Daniel Drake <drake@...lessm.com>
Subject: Re: [BUG] Linux 5.3-rc1: timer problem on x86-64 (Pentium D)

Rui,

On Thu, 25 Jul 2019, Rui Salvaterra wrote:
> On Thu, 25 Jul 2019 at 07:28, Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > The only reason I can think of is that the HPET on that machine has a weird
> > register state (it's not advertised by the BIOS ... )
> >
> > But that does not explain the boot failure completely. If the HPET is not
> > available then the kernel should automatically do the right thing and fall
> > back to something else.
> 
> This may be a useful data point, the relevant part of the dmesg on a
> pristine 5.3-rc1 with clocksource=jiffies:

Duh. Yes, this explains it nicely.

> [    1.123548] clocksource: timekeeping watchdog on CPU1: Marking
> clocksource 'tsc-early' as unstable because the skew is too large:
> [    1.123552] clocksource:                       'hpet' wd_now: 33
> wd_last: 33 mask: ffffffff

The HPET counter check succeeded, but the early enable and the following
reconfiguration confused it completely. So the HPET is not counting:

	'hpet' wd_now: 33 wd_last: 33 mask: ffffffff

That fully explains the boot failure: if the counter is not running, the
HPET timer never expires, and early boot waits forever for the HPET to
fire.
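
For anyone who wants to check a log for the same symptom, here is a minimal
sketch, not kernel code. It parses the watchdog detail line quoted above and
reports whether the watchdog counter moved between the two reads. The helper
name watchdog_stalled is made up for illustration, the line format is taken
from the dmesg excerpt in this thread, and the values are assumed to be
printed in hex. Lines wrapped by a mail client must be joined first.

```python
import re

# Format taken from the dmesg excerpt in this thread; values assumed to be hex.
WD_LINE = re.compile(
    r"'(?P<name>[^']+)'\s+wd_now:\s*(?P<now>[0-9a-fA-F]+)\s+"
    r"wd_last:\s*(?P<last>[0-9a-fA-F]+)\s+mask:\s*(?P<mask>[0-9a-fA-F]+)"
)


def watchdog_stalled(line):
    """Hypothetical helper: return (name, stalled), or None if no match.

    stalled is True when the masked difference between the two reads is
    zero, i.e. the watchdog counter did not advance between them.
    """
    m = WD_LINE.search(line)
    if m is None:
        return None
    now = int(m.group("now"), 16)
    last = int(m.group("last"), 16)
    mask = int(m.group("mask"), 16)
    delta = (now - last) & mask  # same wrap-safe delta idea as a counter mask
    return m.group("name"), delta == 0
```

A zero delta could in theory also come from an exact counter wrap between
the two reads, but with a 32-bit mask and a short watchdog interval that is
not a realistic explanation.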

> > Then boot these kernels with 'hpet=disable' on the command line and see
> > whether they come up. If so please provide the same output.
> 
> Fortunately (as I'm doing this remotely) they did come up.
> With hpet=disable…
> 
> Linux 5.2:
> available_clocksource: tsc acpi_pm
> current_clocksource: tsc
> 
> Linux 5.3-rc1 patched:
> available_clocksource: tsc acpi_pm
> current_clocksource: tsc

That's consistent with the above. 5.3-rc1 unpatched would of course boot as
well with hpet=disable now that we know the root cause.
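
The clocksource values quoted above can be collected with a few lines of
Python. This is a minimal sketch that assumes the usual sysfs location
/sys/devices/system/clocksource/clocksource0; the function name
read_clocksources is made up for illustration, and the base directory is a
parameter so the code can be pointed at a copy of the files.

```python
from pathlib import Path

# Assumed sysfs location of the clocksource attributes.
SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(base=SYSFS_DIR):
    """Hypothetical helper: return (available, current).

    available is the list of names in available_clocksource,
    current is the single name in current_clocksource.
    """
    base = Path(base)
    available = (base / "available_clocksource").read_text().split()
    current = (base / "current_clocksource").read_text().strip()
    return available, current
```

Calling read_clocksources() on a machine booted with hpet=disable should
return a list without 'hpet' in it, matching the output reported above.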

I'll write a changelog and route it to Linus for -rc2.

Thanks a lot for debugging this and providing all the information!

       tglx
