Date:	Fri, 29 Feb 2008 22:56:45 -0500 (EST)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
cc:	paulus@...ba.org, linuxppc-dev@...abs.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] add strncmp to PowerPC


On Sat, 1 Mar 2008, Benjamin Herrenschmidt wrote:
>
> Do we have any indication that it performs better than the C one?

See below.

>
> Ben.
>

> >
> > +_GLOBAL(strncmp)
> > +	mtctr	r5
> > +	addi	r5,r3,-1
> > +	addi	r4,r4,-1
> > +1:	lbzu	r3,1(r5)
> > +	cmpwi	1,r3,0
> > +	lbzu	r0,1(r4)
> > +	subf.	r3,r0,r3
> > +	beqlr	1
> > +	bdnzt	eq,1b
> > +	blr
> > +
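
For readers who don't have the PowerPC mnemonics paged in, here is a
rough C model of what the quoted loop does. It is only a sketch: the
name strncmp_model is made up for illustration, and it assumes n >= 1,
because the assembly as posted moves the count straight into CTR
without testing it for zero first.

```c
#include <stddef.h>

/* Hypothetical C model of the quoted assembly loop; assumes n >= 1. */
int strncmp_model(const char *s1, const char *s2, size_t n)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;
    int diff;

    do {
        unsigned char c1 = *p++;   /* lbzu  r3,1(r5) */
        unsigned char c2 = *q++;   /* lbzu  r0,1(r4) */

        diff = c1 - c2;            /* subf. r3,r0,r3 */
        if (c1 == 0)               /* cmpwi 1,r3,0 ; beqlr 1 */
            return diff;
    } while (--n != 0 && diff == 0);   /* bdnzt eq,1b */

    return diff;                   /* blr */
}
```

The result is the difference of the two bytes taken as unsigned
values, so only its sign (or zero) should be relied on.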


And here's the objdump of the C version:

0000000000000080 <.strncmp>:
  80:   fb e1 ff f0     std     r31,-16(r1)
  84:   f8 21 ff c1     stdu    r1,-64(r1)
  88:   7c 69 1b 78     mr      r9,r3
  8c:   7c a0 2b 79     mr.     r0,r5
  90:   38 60 00 00     li      r3,0
  94:   7c 09 03 a6     mtctr   r0
  98:   7c 3f 0b 78     mr      r31,r1
  9c:   41 82 00 68     beq-    104 <.strncmp+0x84>
  a0:   89 69 00 00     lbz     r11,0(r9)
  a4:   88 04 00 00     lbz     r0,0(r4)
  a8:   7c 00 58 50     subf    r0,r0,r11
  ac:   78 00 06 20     clrldi  r0,r0,56
  b0:   2f a0 00 00     cmpdi   cr7,r0,0
  b4:   7c 00 07 74     extsb   r0,r0
  b8:   7c 03 03 78     mr      r3,r0
  bc:   40 9e 00 48     bne-    cr7,104 <.strncmp+0x84>
  c0:   2f ab 00 00     cmpdi   cr7,r11,0
  c4:   41 9e 00 40     beq-    cr7,104 <.strncmp+0x84>
  c8:   38 84 00 01     addi    r4,r4,1
  cc:   38 69 00 01     addi    r3,r9,1
  d0:   42 40 00 30     bdz-    100 <.strncmp+0x80>
  d4:   88 03 00 00     lbz     r0,0(r3)
  d8:   89 24 00 00     lbz     r9,0(r4)
  dc:   38 63 00 01     addi    r3,r3,1
  e0:   38 84 00 01     addi    r4,r4,1
  e4:   2f 20 00 00     cmpdi   cr6,r0,0
  e8:   7c 09 00 50     subf    r0,r9,r0
  ec:   78 00 06 20     clrldi  r0,r0,56
  f0:   2f a0 00 00     cmpdi   cr7,r0,0
  f4:   7c 00 07 74     extsb   r0,r0
  f8:   40 9e 00 08     bne-    cr7,100 <.strncmp+0x80>
  fc:   40 9a ff d4     bne+    cr6,d0 <.strncmp+0x50>
 100:   7c 03 03 78     mr      r3,r0
 104:   e8 21 00 00     ld      r1,0(r1)
 108:   eb e1 ff f0     ld      r31,-16(r1)
 10c:   4e 80 00 20     blr
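
For context, a plain byte-at-a-time C strncmp along the following
lines would compile to something like the listing above. This is a
sketch under assumptions, not a copy of the kernel source: the name
strncmp_generic is hypothetical, and the 8-bit signed result variable
is inferred from the clrldi ...,56 / extsb pairs in the disassembly.

```c
#include <stddef.h>

/* Hypothetical stand-in for a generic C strncmp (illustration only). */
int strncmp_generic(const char *cs, const char *ct, size_t count)
{
    signed char res = 0;   /* 8-bit result, sign-extended on return */

    while (count) {
        /* Byte difference truncated to 8 bits (clrldi + extsb). */
        res = (signed char)(*cs - *ct++);
        if (res != 0 || !*cs++)
            break;         /* mismatch, or end of string reached */
        count--;
    }
    return res;
}
```

Note that the loop tests count before the first load, so a count of
zero returns 0 without touching either string.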


I'll let you decide ;-)

Even if the C version were somehow faster (which I still doubt), it's a
hell of a lot of cache lines to waste: 36 instructions (144 bytes) against
10 instructions (40 bytes) for the assembly.

-- Steve
