linux-kernel - Re: PROBLEM: E6850 has an 8+ minute delay during boot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <47629568.6050901@rtr.ca>
Date:	Fri, 14 Dec 2007 09:38:32 -0500
From:	Mark Lord <lkml@....ca>
To:	Arun Thomas <arun.thomas@...il.com>
Cc:	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, jirislaby@...il.com
Subject: Re: PROBLEM: E6850 has an 8+ minute delay during boot

Arun Thomas wrote:
...
> [   31.670148] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
> [   31.670351] TCP: Hash tables configured (established 131072 bind 65536)
> [   31.670391] TCP reno registered
> [   31.681591] checking if image is initramfs...<7>Switched to high
> resolution mode on CPU 0
> [  540.133678] Clocksource tsc unstable (delta = 299978139535 ns)
> [  540.137708] Time: hpet clocksource has been installed.
> [  540.432798]  it is
> [  541.570364] Freeing initrd memory: 44428k freed
> [  541.570748] audit: initializing netlink socket (disabled)
...

Ahh... BINGO!

This is very likely the same problem that I first reported back in 2.6.21(20?),
when NOHZ and HPET first came in!

It never occurred to me to wait a full 8-minutes though,
so I just rebooted again after a 1-2 minute wait each time.

Seen on Core2Duo T7400 and Core2Quad E6600 systems, with kernels 2.6.21/22
at various points.  Not consistent -- sometimes it would boot after a few
attempts, sometimes not.

It always hung around the "Switched to high resolution mode" messages.

Thomas Gleixner wrote:
> The problem is caused by an SMI during the calibration routine. We
> really need to come up with a solid solution which does not rely on
> the periodic timer coming in, when there is something else (HPET,
> pm_timer) available.
...

Oh good, an explanation!

Now we just need a creative fix.

One thing that *might* be sufficient, would be to do the
delay loop calibration twice, and compare the results to
see if they're within 10% (pick a number) of each other.

Is there a flag or something that SMI sets that we could poll
before/after the calibration?  That could be used to tell us
that it needs redoing ?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/