Message-ID: <20150429081355.GA5498@pd.tnic>
Date:	Wed, 29 Apr 2015 10:13:55 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	"Zheng, Lv" <lv.zheng@...el.com>
Cc:	linux-edac <linux-edac@...r.kernel.org>,
	Jiri Kosina <jkosina@...e.cz>, Borislav Petkov <bp@...e.de>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Len Brown <lenb@...nel.org>,
	"Luck, Tony" <tony.luck@...el.com>,
	Tomasz Nowicki <tomasz.nowicki@...aro.org>,
	"Chen, Gong" <gong.chen@...ux.intel.com>,
	Wolfram Sang <wsa@...-dreams.de>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 5/5] GHES: Make NMI handler have a single reader

On Wed, Apr 29, 2015 at 12:49:59AM +0000, Zheng, Lv wrote:
> > > We absolutely want to use atomic_add_unless() because we get to save us
> > > the expensive
> > >
> > > 	LOCK; CMPXCHG
> > >
> > > if the value was already 1. Which is exactly what this patch is trying
> > > to avoid - a thundering herd of cores CMPXCHGing a global variable.
> > 
> > IMO, on most architectures, the "cmp" part should work just like what you've done with "if".
> > And on some architectures, if the "xchg" doesn't happen, the "cmp" part won't even cause a pipeline hazard.

Even if CMPXCHG is being split into several microops, they all still
need to flow down the pipe and require resources and tracking. And you
only know at retire time what the CMP result is and can "discard" the
XCHG part. Provided the uarch is smart enough to do that.

This is probably why CMPXCHG needs 5,6,7,10,22,... cycles depending on
uarch and vendor, if I can trust Agner Fog's tables. And I bet those
numbers are best-case only and in real life they probably turn out
even worse.

CMP needs only 1. On almost every uarch and vendor. And even that cycle
probably gets hidden with a good branch predictor.

> If you mean the LOCK prefix, I understand now.

And that makes it several times worse: 22, 40, 80, ... cycles.
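
To make the difference concrete, here is a rough userspace sketch of the
two approaches in C11 atomics. It is not the patch itself: the function
names are made up for illustration, and try_enter_check_first() only
mimics what atomic_add_unless(v, 1, 1) does, assuming a flag that is
either 0 (free) or 1 (taken). The point is that the check-first variant
does a plain load and compare, and touches the cacheline with a locked
operation only when the flag looks free.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Unconditional variant: every caller issues a locked compare-and-swap,
 * even when the flag is already 1 and the swap is bound to fail.
 */
static bool try_enter_always_cas(atomic_int *v)
{
	int expected = 0;

	return atomic_compare_exchange_strong(v, &expected, 1);
}

/*
 * Check-first variant (hypothetical name), modelled on the semantics of
 * atomic_add_unless(v, 1, 1): read the value with a plain load and only
 * attempt the compare-and-swap if it is not already 1.
 */
static bool try_enter_check_first(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	while (old != 1) {
		/* On failure, 'old' is refreshed and the loop re-checks it. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}

	/* Flag was already 1: no locked instruction was issued. */
	return false;
}

/* Release the flag again (hypothetical helper). */
static void leave(atomic_int *v)
{
	atomic_fetch_sub(v, 1);
}
```

With many cores arriving while the flag is held, all but one take the
early-exit path in try_enter_check_first() and pay only for the load and
the compare.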

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.