Message-ID: <4b0e5941-c37e-3c85-3809-45f33ce35657@c-s.fr>
Date:   Mon, 20 Jan 2020 18:08:23 +0100
From:   Christophe Leroy <christophe.leroy@....fr>
To:     Segher Boessenkool <segher@...nel.crashing.org>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>, nathanl@...ux.ibm.com,
        arnd@...db.de, tglx@...utronix.de, vincenzo.frascino@....com,
        luto@...nel.org, x86@...nel.org, linuxppc-dev@...ts.ozlabs.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-mips@...r.kernel.org
Subject: Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.



On 20/01/2020 at 16:19, Segher Boessenkool wrote:
> On Mon, Jan 20, 2020 at 02:56:00PM +0000, Christophe Leroy wrote:
>>> Nice!  Much better.
>>>
>>> It should be tested on more representative hardware, too, but this looks
>>> promising alright :-)
>>
>> mpc832x (e300c2 core) at 333 MHz:
>>
>> Before:
>>
>> gettimeofday:    vdso: 235 nsec/call
>> clock-gettime-realtime:    vdso: 244 nsec/call
>>
>> With the series:
>>
>> gettimeofday:    vdso: 271 nsec/call
>> clock-gettime-realtime:    vdso: 281 nsec/call
> 
> Those are important, and degrade ~15%.  That is acceptable IMO, but do
> you see a way to optimise this (later)?

Not easy I think.

First we have the unavoidable ASM entry function, which can't be dropped 
because the CR[SO] bit has to be set on error and cleared on success, 
and that can't be done in C.

In our ASM VDSO, fixed shifts are used, while in the generic C VDSO the 
shift amounts are variable and read from the VDSO data at run time.
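
To illustrate the difference between the two kinds of shift, here is a
minimal sketch. It is not the kernel code: struct demo_vdso_data, its
fields, DEMO_FIXED_SHIFT and both helper names are hypothetical and only
stand in for the real VDSO data layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the VDSO data page (not the real layout). */
struct demo_vdso_data {
	uint32_t mult;	/* cycles-to-ns multiplier */
	uint32_t shift;	/* cycles-to-ns shift, only known at run time */
};

/* Assumed value, chosen only for this example. */
#define DEMO_FIXED_SHIFT 20

/* Fixed shift: the count is an immediate, nothing is loaded for it. */
static uint64_t ns_fixed_shift(uint64_t cycles, uint32_t mult)
{
	return (cycles * mult) >> DEMO_FIXED_SHIFT;
}

/* Generic shift: the count is loaded from the data page, so on a 32-bit
 * target the compiler has to emit a variable 64-bit shift sequence. */
static uint64_t ns_generic_shift(const struct demo_vdso_data *vd,
				 uint64_t cycles)
{
	assert(vd->shift < 64);	/* a shift count >= 64 is undefined in C */
	return (cycles * vd->mult) >> vd->shift;
}
```

With a compile-time constant the shift of a 64-bit value on a 32-bit
target needs no extra load for the count, while the variable form needs
the load plus the longer sequence.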

And there is still some funny code generated by GCC (8.1), like:

  620:	7d 29 3c 30 	srw     r9,r9,r7
  624:	21 87 00 20 	subfic  r12,r7,32
  628:	7d 07 3c 31 	srw.    r7,r8,r7
  62c:	7d 08 60 30 	slw     r8,r8,r12
  630:	7d 0b 4b 78 	or      r11,r8,r9
  634:	39 40 00 00 	li      r10,0
  638:	40 82 00 84 	bne     6bc <__c_kernel_clock_gettime+0x114>
  63c:	81 23 00 24 	lwz     r9,36(r3)
  640:	81 05 00 00 	lwz     r8,0(r5)
...
  6bc:	7d 69 5b 78 	mr      r9,r11
  6c0:	7c ea 3b 78 	mr      r10,r7
  6c4:	7d 2b 4b 78 	mr      r11,r9
  6c8:	4b ff ff 74 	b       63c <__c_kernel_clock_gettime+0x94>

This branch to 6bc is totally useless:
- copying r11 into r9 is pointless as r9 is overwritten at 63c
- copying r9 back into r11 is pointless as r11 has not been modified 
in between.
- loading r10 with 0 then overwriting r10 with r7 when r7 is not 0 is 
pointless as well; the result of srw. could have been put directly in r10.
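
For reference, my reading of the sequence at 620-630 is that it is a
64-bit logical right shift done with 32-bit operations, with the input
in r8:r9 and the result in r10:r11. A rough C equivalent is sketched
below; shr64_via_32 is a hypothetical name, and the register mapping is
an assumption based only on the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit logical right shift of hi:lo by n, for 0 <= n < 32, using only
 * 32-bit shifts. Mirrors the srw / subfic / srw. / slw / or sequence. */
static uint64_t shr64_via_32(uint32_t hi, uint32_t lo, unsigned int n)
{
	uint32_t new_hi, new_lo;

	assert(n < 32);

	new_lo = lo >> n;		/* srw  r9,r9,r7 */
	new_hi = hi >> n;		/* srw. r7,r8,r7 */

	/* slw r8,r8,r12 with r12 = 32 - n, then or r11,r8,r9.
	 * The n != 0 guard is needed in C only: shifting a 32-bit value
	 * by 32 is undefined in C, whereas slw yields 0 in that case. */
	if (n != 0)
		new_lo |= hi << (32 - n);

	/* High word is simply hi >> n, no test against 0 is required. */
	return ((uint64_t)new_hi << 32) | new_lo;
}
```

The high word of the result is just the value produced by srw., whether
it is 0 or not, which is why the li/bne/mr detour brings nothing.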

Christophe
