Message-ID: <alpine.LFD.2.02.1205242013490.3231@ionos>
Date:	Thu, 24 May 2012 20:18:07 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	"Luck, Tony" <tony.luck@...el.com>
cc:	Chen Gong <gong.chen@...ux.intel.com>,
	"bp@...64.org" <bp@...64.org>, "x86@...nel.org" <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: RE: [PATCH] x86: auto poll/interrupt mode switch for CMC to stop
 CMC storm

On Thu, 24 May 2012, Luck, Tony wrote:

> > So can you please explain how this is better than having this strict
> > per cpu and avoid all the mess which comes with that patch? The
> > approach of letting global state be modified in a random manner is
> > just doomed.
> 
> Well, doomed sounds bad :-) ... and I think I now agree that we should
> get rid of global state and have polling vs. CMCI mode be per-cpu. It
> means that it will take fractionally longer to react to a storm, but
> on the plus side we'll naturally set storm mode on just the cpus
> that are seeing it on a multi-socket system without having to check
> topology data ... which should be better for the case where a noisy
> source of CMCI is plaguing one socket, while other sockets have some
> much lower rate of CMCI that we'd still like to log.

I thought more about it - see my patch. So I have global state now as
well, but it only makes sure that a CPU stays in poll mode as long as
other CPUs are still in poll mode. That's good, I think, because it
avoids the following:

CMCIs which affect siblings or a whole socket are delivered to all
affected cores, but only one core might see the bank. So all the others
would re-enable CMCI quickly and then switch back to polling because
the storm still persists. That would ping-pong, so we probably want to
avoid it.

Ideally the storm_on_cpus variable should be per socket and not system
wide, but we can do that when it really becomes an issue.
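
To make that concrete, here is a rough sketch of the scheme
(illustrative only, not the actual patch: the threshold value, the
storm-state names and the cmci_storm_* helpers are made up for this
example; cmci_clear()/cmci_reenable() stand for disabling and
re-enabling CMCI on the local CPU):

/*
 * Illustrative sketch only. Each CPU tracks its own storm state;
 * storm_on_cpus counts how many CPUs are currently in poll mode.
 */
#define CMCI_STORM_THRESHOLD	15

enum cmci_storm_states {
	CMCI_STORM_NONE,
	CMCI_STORM_ACTIVE,
	CMCI_STORM_SUBSIDED,
};

static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
static DEFINE_PER_CPU(unsigned int, cmci_storm_state);

/* Number of CPUs currently in storm (poll) mode, system wide. */
static atomic_t storm_on_cpus;

/* Called from the CMCI interrupt handler. */
static bool cmci_storm_detect(void)
{
	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
		return true;

	if (__this_cpu_inc_return(cmci_storm_cnt) <= CMCI_STORM_THRESHOLD)
		return false;

	/* Storm: stop taking interrupts on this CPU, poll instead. */
	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
	atomic_inc(&storm_on_cpus);
	cmci_clear();
	return true;
}

/* Called from this CPU's poll timer once its banks are quiet again. */
static void cmci_storm_maybe_exit(void)
{
	switch (__this_cpu_read(cmci_storm_state)) {
	case CMCI_STORM_ACTIVE:
		/* Our own storm is over, note that globally. */
		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
		__this_cpu_write(cmci_storm_cnt, 0);
		atomic_dec(&storm_on_cpus);
		/* fall through */
	case CMCI_STORM_SUBSIDED:
		/*
		 * Stay in poll mode until every CPU has left storm mode.
		 * Otherwise cores which share the noisy bank but never
		 * see it would re-enable CMCI and drop straight back to
		 * polling - the ping-pong described above.
		 */
		if (!atomic_read(&storm_on_cpus)) {
			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
			cmci_reenable();
		}
		break;
	}
}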

Thanks,

	tglx
