Date: Thu, 21 Jun 2012 16:43:54 +0200
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
To: Robert Richter <robert.richter@....com>
Cc: Luming Yu <luming.yu@...il.com>, LKML <linux-kernel@...r.kernel.org>,
    tglx@...utronix.de, sfr@...b.auug.org.au,
    Andrew Morton <akpm@...ux-foundation.org>, jcm@...masters.org,
    linux-next@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
    torvalds@...ux-foundation.org
Subject: Re: What is the right practice to get new code upstream( was Fwd: [patch] a simple hardware detector for latency as well as throughput ver. 0.1.0)

On Thu, 2012-06-21 at 15:29 +0200, Robert Richter wrote:
> On 14.06.12 12:04:56, Peter Zijlstra wrote:
> > For AMD there's only event 02Bh, which is SMIs Received. I'm not sure it
> > has anything like the FREEZE or if the event is modifyable to count the
> > cycles in SMI.
>
> Peter, which use cases do you have in mind. Is it to root cause
> latencies? Or just to see what happens on the system, you long it
> spends in smi mode? On current systems counting smi cycles seems not
> to be possible.

Yeah exactly. So we can whack vendors over the head with hard evidence
their BIOS is utter shite.

So what we do now is disable interrupts, run a tight TSC read loop and
report fail when you see a big delta. Now some 'creative' BIOS people
thought it would be a good idea to save/restore TSC over the SMI, this
avoids detection. It also completely wrecks TSC sync across cores.

But the SMI stuff is a real problem for -rt, this feature^Wfailure-add
is a real problem, we've seen SMIs that go well above a ms in duration,
which of course completely wreck the system. IIRC the worst tglx ever
encountered was 0.5s or so.

So ideally the PMU would have 2 events, one counting SMIs one counting
cycles in SMM. Both should ignore any and all FREEZE_IN_SMM bits if such
a thing exists. The hardware should also hard fail if such a counter is
fiddled with from SMM context.

This would give us the capability to log exactly when and for how long
the system is taken from us and makes it impossible to 'fix' from SMM.
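
For readers unfamiliar with the detection approach mentioned above (a tight
timestamp read loop that reports a failure on a big delta), here is a
minimal user-space sketch in C. It is not the kernel detector discussed in
this thread. It assumes clock_gettime(CLOCK_MONOTONIC) as a portable
stand-in for a raw TSC read, and it cannot disable interrupts, so ordinary
preemption will also show up as gaps. The names gap_report, scan_gaps,
now_ns and sample_clock are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical result type for this sketch (not from the original mail). */
struct gap_report {
    uint64_t max_delta;  /* largest delta between consecutive samples */
    size_t over_count;   /* number of deltas strictly above the threshold */
};

/*
 * Scan consecutive timestamps and record every "big delta".
 * Assumes the samples are non-decreasing; a clock that was rewound
 * (e.g. a TSC restored after an SMI) would wrap the unsigned subtraction.
 */
static struct gap_report scan_gaps(const uint64_t *ts, size_t n,
                                   uint64_t threshold)
{
    struct gap_report r = { 0, 0 };

    assert(ts != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        uint64_t delta = ts[i] - ts[i - 1];

        if (delta > r.max_delta)
            r.max_delta = delta;
        if (delta > threshold)
            r.over_count++;
    }
    return r;
}

/* Monotonic clock in nanoseconds; stand-in for reading the TSC. */
static uint64_t now_ns(void)
{
    struct timespec t;

    clock_gettime(CLOCK_MONOTONIC, &t);
    return (uint64_t)t.tv_sec * 1000000000ull + (uint64_t)t.tv_nsec;
}

/* The tight read loop: back-to-back clock reads with no other work. */
static void sample_clock(uint64_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = now_ns();
}
```

A caller would fill a buffer with sample_clock() and pass it to scan_gaps()
with a threshold in nanoseconds. As the mail points out, firmware that saves
and restores the counter across an SMI defeats this kind of measurement,
which is the motivation for the PMU events proposed above.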