Message-ID: <1395667063.12610.24.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Mon, 24 Mar 2014 06:17:43 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	David Laight <David.Laight@...LAB.COM>
Cc:	David Miller <davem@...emloft.net>,
	"andi@...stfloor.org" <andi@...stfloor.org>,
	"hpa@...or.com" <hpa@...or.com>,
	"kaber@...sh.net" <kaber@...sh.net>,
	"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
	"hkchu@...gle.com" <hkchu@...gle.com>,
	"mwdalton@...gle.com" <mwdalton@...gle.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [RFC] csum experts, csum_replace2() is too expensive

On Mon, 2014-03-24 at 10:30 +0000, David Laight wrote:
> From: Eric Dumazet
> > On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> > > From: Eric Dumazet <eric.dumazet@...il.com>
> > > Date: Fri, 21 Mar 2014 05:50:50 -0700
> > >
> > > > It looks like a barrier() would be more appropriate.
> > >
> > > barrier() == __asm__ __volatile__("" ::: "memory")
> > 
> > Indeed, but now that you mention it, ip_fast_csum() does not use the
> > volatile keyword on x86_64, and has no "m" constraint either.
> 
> Adding 'volatile' isn't sufficient to force gcc to write data
> into the area being checksummed.

You missed the point. It's not about forcing gcc to write data, because
it does.

The point is that gcc doesn't compute the checksum a second time: it
reuses the first result.

> ip_fast_csum() either needs an explicit "m" constraint for the actual
> buffer (and target) bytes, or the stronger "memory" constraint.
> The 'volatile' is then not needed.

How about taking a look at the actual code?

The "memory" clobber is already there. And no, it's not enough, otherwise
I wouldn't have sent this mail.

I actually compiled the code and double-checked.

0000000000007010 <foobar>:
    7010:	e8 00 00 00 00       	callq  7015 <foobar+0x5>
			7011: R_X86_64_PC32	__fentry__-0x4
    7015:	55                   	push   %rbp
    7016:	31 c0                	xor    %eax,%eax
    7018:	b9 05 00 00 00       	mov    $0x5,%ecx
    701d:	48 89 e5             	mov    %rsp,%rbp
    7020:	48 83 ec 20          	sub    $0x20,%rsp
    7024:	48 89 5d e8          	mov    %rbx,-0x18(%rbp)
    7028:	4c 89 6d f8          	mov    %r13,-0x8(%rbp)
    702c:	48 89 fb             	mov    %rdi,%rbx
    702f:	4c 89 65 f0          	mov    %r12,-0x10(%rbp)
    7033:	41 89 d5             	mov    %edx,%r13d
    7036:	66 89 47 0a          	mov    %ax,0xa(%rdi)
    703a:	66 89 77 02          	mov    %si,0x2(%rdi)
    703e:	48 89 f8             	mov    %rdi,%rax
    7041:	48 89 fe             	mov    %rdi,%rsi
    7044:	44 8b 20             	mov    (%rax),%r12d
    7047:	83 e9 04             	sub    $0x4,%ecx
    704a:	76 2e                	jbe    707a <foobar+0x6a>
    704c:	44 03 60 04          	add    0x4(%rax),%r12d
    7050:	44 13 60 08          	adc    0x8(%rax),%r12d
    7054:	44 13 60 0c          	adc    0xc(%rax),%r12d
    7058:	44 13 60 10          	adc    0x10(%rax),%r12d
    705c:	48 8d 40 04          	lea    0x4(%rax),%rax
    7060:	ff c9                	dec    %ecx
    7062:	75 f4                	jne    7058 <foobar+0x48>
    7064:	41 83 d4 00          	adc    $0x0,%r12d
    7068:	44 89 e1             	mov    %r12d,%ecx
    706b:	41 c1 ec 10          	shr    $0x10,%r12d
    706f:	66 41 01 cc          	add    %cx,%r12w
    7073:	41 83 d4 00          	adc    $0x0,%r12d
    7077:	41 f7 d4             	not    %r12d
    707a:	31 c0                	xor    %eax,%eax
    707c:	66 44 89 67 0a       	mov    %r12w,0xa(%rdi)
    7081:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
			7084: R_X86_64_32S	.rodata.str1.1+0xabd
    7088:	e8 00 00 00 00       	callq  708d <foobar+0x7d>
			7089: R_X86_64_PC32	printk-0x4
    708d:	66 44 89 6b 02       	mov    %r13w,0x2(%rbx)
    7092:	66 44 89 63 0a       	mov    %r12w,0xa(%rbx)
    7097:	4c 8b 6d f8          	mov    -0x8(%rbp),%r13
    709b:	48 8b 5d e8          	mov    -0x18(%rbp),%rbx
    709f:	4c 8b 65 f0          	mov    -0x10(%rbp),%r12
    70a3:	c9                   	leaveq 
    70a4:	c3                   	retq   
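
Reading the listing: the checksum loop (7044..7077) runs once, before the
printk call. After the call, the second length is stored at 0x2(%rbx)
(708d) and the value still held in %r12w is stored again at 0xa(%rbx)
(7092), with no second pass over the header.

The following is a minimal portable C sketch of that behaviour, not the
code that was compiled. The source of foobar is not shown in this mail, so
update_expected() is only an assumption inferred from the stores in the
listing. struct hdr_model, ip_csum_model(), update_as_compiled() and
header_ok() are likewise hypothetical names, and the plain C loop merely
stands in for the x86_64 inline asm in ip_fast_csum().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 20-byte header model; tot_len sits at offset 0x2 and
 * check at offset 0xa, the two offsets written in the listing. */
struct hdr_model {
	uint8_t  ver_ihl;
	uint8_t  tos;
	uint16_t tot_len;
	uint16_t id;
	uint16_t frag_off;
	uint8_t  ttl;
	uint8_t  protocol;
	uint16_t check;
	uint32_t saddr;
	uint32_t daddr;
};

/* Portable stand-in for ip_fast_csum(): one's-complement sum over
 * ihl 32-bit words, folded to 16 bits and inverted. */
static uint16_t ip_csum_model(const void *hdr, unsigned int ihl)
{
	const uint8_t *p = hdr;
	uint32_t sum = 0;
	unsigned int i;

	assert(ihl >= 5);	/* shortest legal IPv4 header is 5 words */

	for (i = 0; i < ihl * 2; i++) {
		uint16_t w;

		memcpy(&w, p + 2 * i, sizeof(w));	/* 16-bit word, memory order */
		sum += w;
	}
	sum = (sum & 0xffff) + (sum >> 16);	/* fold the carries back in */
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Assumed shape of the source: the checksum is computed twice. */
static void update_expected(struct hdr_model *h, uint16_t len1, uint16_t len2)
{
	h->tot_len = len1;
	h->check = 0;
	h->check = ip_csum_model(h, 5);
	/* the printk() call sits here in the listing */
	h->tot_len = len2;
	h->check = 0;
	h->check = ip_csum_model(h, 5);
}

/* What the listing does: one checksum, stored twice. */
static void update_as_compiled(struct hdr_model *h, uint16_t len1, uint16_t len2)
{
	uint16_t csum;

	h->check = 0;			/* 7036 */
	h->tot_len = len1;		/* 703a */
	csum = ip_csum_model(h, 5);	/* 7044..7077 */
	h->check = csum;		/* 707c */
	/* printk() call at 7088 */
	h->tot_len = len2;		/* 708d */
	h->check = csum;		/* 7092: stale value reused */
}

/* A header verifies when the checksum over all words, including the
 * check field itself, comes out as zero. */
static int header_ok(const struct hdr_model *h)
{
	return ip_csum_model(h, 5) == 0;
}
```

With two different lengths, for example 40 and 1500, the header left by
update_expected() passes header_ok(), while the one left by
update_as_compiled() does not, because its check field still matches the
old length.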

