lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1395674087.12610.44.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Mon, 24 Mar 2014 08:14:47 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	David Laight <David.Laight@...LAB.COM>
Cc:	David Miller <davem@...emloft.net>,
	"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
	"hkchu@...gle.com" <hkchu@...gle.com>,
	"mwdalton@...gle.com" <mwdalton@...gle.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next] net: optimize csum_replace2()

On Mon, 2014-03-24 at 14:38 +0000, David Laight wrote:
> But C only has 32bit maths, so all the 16bit values keep
> need masking.
> 
> > But again if you prefer m ^ 0xffff, thats up to you ;)
> 
> On ppc I get much better code for (retyped):
> static inline u32 csum_add(u32 c, u32 a)
> {
> 	c += a'
> 	return (c + (c >> 16)) & 0xffff;
> }
> void csum_repl(u16 *c16, u16 o16, u16 n)
> {
> 	u32 c = *c16, o = o16;
> 	*c16 = csum_add(csum_add(c ^ 0xffff, o ^ 0xffff), n);
> }
> 13 instructions and no branches against 21 instructions
> and 2 branches.
> The same is probably true of arm.
> For amd64 (where the compiler can do a 16bit compare)
> my version is one instruction longer.
> I've not tried to guess at the actual cycle counts!
> 
> 	David

Code I provided uses no conditional branch on x86.

It sounds you could provide helper to arm, if you really care of this
path. I find surprising you did not comment on my prior mails on this
subject and you suddenly seem to care now the patch is merged.

We for example have the following helper in x86 :

static inline unsigned add32_with_carry(unsigned a, unsigned b)
{
        asm("addl %2,%0\n\t"
            "adcl $0,%0"
            : "=r" (a)
            : "0" (a), "r" (b));
        return a;
}

But these days, gcc seems to do a pretty good job without these helpers.

Yes we could save 4 instructions, but I doubt it is worth the pain of
adding arch helpers. I certainly wont do it.

0000000000000030 <inet_gro_complete>:
      30:	55                   	push   %rbp
      31:	48 63 c6             	movslq %esi,%rax
      34:	48 03 87 d0 00 00 00 	add    0xd0(%rdi),%rax
      3b:	48 89 e5             	mov    %rsp,%rbp
      3e:	8b 57 68             	mov    0x68(%rdi),%edx
      41:	44 0f b7 40 02       	movzwl 0x2(%rax),%r8d
      46:	0f b7 48 0a          	movzwl 0xa(%rax),%ecx
      4a:	66 29 f2             	sub    %si,%dx
      4d:	66 c1 c2 08          	rol    $0x8,%dx
      51:	44 0f b6 48 09       	movzbl 0x9(%rax),%r9d
      56:	66 89 50 02          	mov    %dx,0x2(%rax)
      5a:	41 f7 d0             	not    %r8d
      5d:	f7 d1                	not    %ecx
      5f:	66 44 01 c1          	add    %r8w,%cx
      63:	41 0f 92 c0          	setb   %r8b
      67:	45 0f b6 c0          	movzbl %r8b,%r8d
      6b:	44 01 c1             	add    %r8d,%ecx
      6e:	66 01 d1             	add    %dx,%cx
      71:	41 0f 92 c0          	setb   %r8b
      75:	45 0f b6 c0          	movzbl %r8b,%r8d
      79:	44 01 c1             	add    %r8d,%ecx
      7c:	f7 d1                	not    %ecx
      7e:	66 89 48 0a          	mov    %cx,0xa(%rax)
      82:	49 63 c1             	movslq %r9d,%rax
      85:	48 8b 04 c5 00 00 00 	mov    0x0(,%rax,8),%rax


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ