Open Source and information security mailing list archives
Message-ID: <20140714151433.GE25115@pd.tnic>
Date: Mon, 14 Jul 2014 17:14:34 +0200
From: Borislav Petkov <bp@...en8.de>
To: Havard Skinnemoen <hskinnemoen@...gle.com>
Cc: Tony Luck <tony.luck@...il.com>,
    Linux Kernel <linux-kernel@...r.kernel.org>,
    Ewout van Bekkum <ewout@...gle.com>,
    linux-edac <linux-edac@...r.kernel.org>
Subject: Re: [PATCH 1/6] x86-mce: Modify CMCI poll interval to adjust for small check_interval values.

On Fri, Jul 11, 2014 at 05:10:07PM -0700, Havard Skinnemoen wrote:
> 200ms per second means we're using 20% of that CPU. I'd say that's
> definitely too much. But I like the general approach.

Right.

> > Yeah, by "generous" I meant, choose values which fit all. But I realize
> > now that this is a dumb idea. Maybe we could measure it on each system,
> > read the TSC on CMCI entry and exit and thus get an average CMCI
> > duration...
>
> Sounds interesting. Some things that may need some more thought:
>
> 1. What percentage of CPU is OK to use before we consider it a storm?

That is a very good question. Normally, when we don't know that answer,
we leave it user-configurable with a sane default :-)

But if we have to be realistic, anything above 20% of CPU time spent in
storm mode for prolonged periods of time would probably mean this system
needs to get scheduled for maintenance anyway.

The whole storm thing is basically showing that a system is about to
fail soon and we're trying to alleviate performance hit from too high
CMCI counts by switching to polling, i.e., prolonged, more graceful hw
fail. :-)

> 2. How do we map that number to polling mode, where we may not see all
> the errors? If we get it wrong, we may end up bouncing at a very high
> rate.

Well, with polling you're bound to miss some errors anyway.

> 3. If we go for a fixed polling rate, how do we make sure it doesn't
> require more CPU than what we determined in (1)?

Yeah, that's the disadvantage of fixed polling rate - we won't know.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
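
For illustration only, the measurement idea discussed above (timestamp the CMCI handler on entry and exit, then compare the share of CPU time against a configurable limit) might look roughly like the user-space sketch below. Nothing here comes from the patch set: struct cmci_load, the cmci_load_*() helpers and cmci_is_storm() are hypothetical names, and the timestamps are passed in as plain integers rather than read from the TSC. The 20% figure is just the default floated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one CPU; not a kernel structure. */
struct cmci_load {
	uint64_t busy_cycles;  /* cycles spent in the handler this window */
	uint64_t window_start; /* timestamp at which the window began */
	uint64_t entry;        /* timestamp of the current handler entry */
};

/* Record the timestamp taken on handler entry. */
static void cmci_load_enter(struct cmci_load *l, uint64_t now)
{
	l->entry = now;
}

/* Account the time spent in the handler since the matching entry. */
static void cmci_load_exit(struct cmci_load *l, uint64_t now)
{
	assert(now >= l->entry);
	l->busy_cycles += now - l->entry;
}

/*
 * Return the percentage of the window spent in the handler and start
 * a new window. An empty window reports 0.
 */
static unsigned int cmci_load_percent(struct cmci_load *l, uint64_t now)
{
	uint64_t elapsed = now - l->window_start;
	unsigned int pct = 0;

	if (elapsed)
		pct = (unsigned int)(l->busy_cycles * 100 / elapsed);

	l->busy_cycles = 0;
	l->window_start = now;
	return pct;
}

/* Assumed policy: a storm is anything strictly above the limit. */
static bool cmci_is_storm(unsigned int pct, unsigned int limit_pct)
{
	return pct > limit_pct;
}
```

With a window of 1000 time units and 200 of them spent in the handler, cmci_load_percent() reports 20, which matches the "200ms per second means 20%" arithmetic quoted above. This only covers question 1; it says nothing about how to map the number to polling mode, where some errors are never seen.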