Date:	Fri, 21 Mar 2014 14:14:23 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Eric Dumazet' <eric.dumazet@...il.com>,
	Andi Kleen <andi@...stfloor.org>,
	"H. Peter Anvin" <hpa@...or.com>
CC:	Patrick McHardy <kaber@...sh.net>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	"H.K. Jerry Chu" <hkchu@...gle.com>,
	"Michael Dalton" <mwdalton@...gle.com>,
	netdev <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [RFC] csum experts, csum_replace2() is too expensive

From: Eric Dumazet
> On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> > Eric Dumazet <eric.dumazet@...il.com> writes:
> > >
> > > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > > is insane...
> >
> >
> > Couldn't it just be the cache miss?
> 
> Or the fact that we mix 16 bit stores and 32bit loads ?
> 
> BTW, any idea why ip_fast_csum() on x86 contains a "memory" constraint ?

The correct constraint would be one that told gcc that it
accesses the 20 bytes from the source pointer.

Without it gcc won't necessarily write out the values before
the asm instructions execute.
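
As a rough sketch of what such a narrower constraint could look like
(this is not the kernel's ip_fast_csum(); the names csum20_sketch and
struct ip_hdr20 are made up for illustration), an "m" input operand
typed as a 20-byte object tells gcc that the asm reads exactly those
bytes, so pending stores to them must be written out first, without
the blanket "memory" clobber. The asm body is left empty here so the
example builds on any architecture, and the sum itself is done in
plain C; a real version would do the summing inside the asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the 20 bytes of an IPv4 header without options. */
struct ip_hdr20 {
	unsigned char bytes[20];
};

/*
 * Sketch only: returns the Internet checksum of 20 bytes at iph,
 * as a host-order number built from big-endian 16-bit words.
 */
static uint16_t csum20_sketch(const void *iph)
{
	const unsigned char *p = iph;
	uint32_t sum = 0;
	size_t i;

	assert(iph != NULL);

	/*
	 * The "m" operand declares that the asm reads the 20 bytes at
	 * *iph, so gcc has to store any cached values for them before
	 * this point. Nothing else in memory is declared as touched.
	 * The template is empty on purpose (assumption: illustration
	 * of the constraint only, no real instructions).
	 */
	__asm__ __volatile__("" : : "r" (p), "m" (*(const struct ip_hdr20 *)iph));

	/* Add the ten 16-bit words of the header. */
	for (i = 0; i < 20; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];

	/* Fold the carries back into the low 16 bits. */
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Called on a header whose checksum field is zero, this returns the
value to put in that field; called on a header with a valid checksum
already filled in, it returns 0.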

	David
