linux-kernel - Re: tsc timer related problems/questions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 10 Sep 2007 19:19:38 -0600
From:	Robert Hancock <hancockr@...w.ca>
To:	Dennis Lubert <plasmahh@....net>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: tsc timer related problems/questions

Dennis Lubert wrote:
> Hello list,
> 
> we are encountering a few behaviours regarding the ways to get accurate
> timer values under Linux that we would call bugs, and where we are
> currently stuck in further diagnosing and/or fixing.
> 
> Background: We are developing for SMP servers with up to 8 CPUs (mostly
> AMD64) and for various reasons would like to have time measurements with
> a resolution of maybe a few microseconds.
> 
> 
> - Using Kernel 2.6.20.7 and surroundings per default the TSC Timer is
> used. We are very happy with that (accuracy ~400nanoseconds) but after a
> while the system goes wild with the following message for each CPU:
> 
> [105771.523771] BUG: soft lockup detected on CPU#1!
> [105771.527869]
> [105771.527871] Call Trace:
> [105771.536079]  <IRQ>  [<ffffffff802619cc>] _spin_lock+0x9/0xb
> [105771.540294]  [<ffffffff802a6f9d>] softlockup_tick+0xd2/0xe7
> [105771.544359]  [<ffffffff8024bcbb>] run_local_timers+0x13/0x15
> [105771.548541]  [<ffffffff80289fc1>] update_process_times+0x4c/0x79
> [105771.552737]  [<ffffffff80270327>] smp_local_timer_interrupt
> +0x34/0x54
> [105771.556934]  [<ffffffff80270834>] smp_apic_timer_interrupt+0x51/0x68
> [105771.561022]  [<ffffffff80268121>] default_idle+0x0/0x42
> [105771.565199]  [<ffffffff8025cce6>] apic_timer_interrupt+0x66/0x70
> [105771.569386]  <EOI>  [<ffffffff8026814e>] default_idle+0x2d/0x42
> [105771.573597]  [<ffffffff80247929>] enter_idle+0x22/0x24
> [105771.577665]  [<ffffffff80247a92>] cpu_idle+0x5a/0x79
> [105771.581838]  [<ffffffff806bc5f7>] start_secondary+0x474/0x483
> 
> Question: Is this a known bug already or should further investigation
> take place?

It's unclear what that could be. As Arjan mentioned this can be caused 
by the BIOS going off into SMI mode for a long time. If you don't have 
ACPI turned on, doing this may prevent this from happening.

> 
> - Using Kernels from 2.6.21 on (random sampled) we experience that the
> TSC isn't used per default anymore (we usually set the nopmtimer option
> at boot for a while now). Looking briefly at the 2.6.23-rc5 code shows
> that in the function where the check is done whether the tsc is stable
> the only code path where a "is stable" result could be returned is one
> where the vendor of the CPU is detected as Intel. Instead a much slower
> timesource (10ms instead of a few us resolution, same for getting the
> time at all) is used which is totally unusable for us (Within 10ms so
> much things happen).

What time source is getting used? The best alternative is HPET, most 
newer systems are providing that now. After that, there's ACPI PM timer 
(make sure you have ACPI enabled). The worst possible fallback is the 
PIT, which from this poor resolution sounds like what it is using.

> 
> Question: Why are only Intel CPUs considered as stable? Could there be
> implemented a more sophisticated heuristic, that actually does some
> tests for tsc stability?

AMD CPUs don't seem to have synchronized TSCs across multiple CPUs. It 
seems this is the case even with different cores in the same CPU 
package. Therefore the TSC is not considered a suitable time source on 
multi-CPU AMD systems.

> - Enabling tsc explicitly as a time source via sysfs we had good results
> so far, with quit good resolution, and also various tests about
> synchronization between the CPUs didn't show any measurable changes in
> the deviation over time.
> However, once accidentally someone enabled cpufrequency scaling and
> scaled down two of four CPUs. From then on the time on the slower CPU
> was totally wrong, and all time displaying programs (simple date
> program) showed different (hours in difference) results, depending on
> which CPU they where run, so results were randomly. Programs doing a
> simple usleep() could hang (likely because the time to wakeup was
> gathered from another CPU whith time in the future). The system
> was essentially unusable and also after setting the CPUs back to the
> correct speed, things were still wrong.
> 
> Question: Is this a known problem? It looks like there is a huge problem
> in synchronizing the way the time is calculated from the TSC and the cpu
> frequency scaling, also something else seems to be buggy since also
> after setting things back even after a few seconds only, times are off
> by hours.

This is expected behavior if you force TSC usage on with CPU frequency 
scaling enabled, there's a reason we turn that off normally. (Also in 
the case where some CPUs stop the TSC in certain power-saving halt 
states.) Theoretically one could track the TSC between different CPUs 
running at different clock speeds, etc. and across halts, but it doesn't 
really seem worth the trouble, especially in cases like AMD multi-CPU 
where the TSC can't be trusted across CPUs anyway.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@...pamshaw.ca
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/