Date:   Sat, 26 Oct 2019 18:06:52 +0200
From:   Christophe Leroy <christophe.leroy@....fr>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        vincenzo.frascino@....com, luto@...nel.org,
        linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.



On 26/10/2019 at 17:53, Thomas Gleixner wrote:
> On Tue, 22 Oct 2019, Christophe Leroy wrote:
>> On 22/10/2019 at 11:01, Christophe Leroy wrote:
>>> On 21/10/2019 at 23:29, Thomas Gleixner wrote:
>>>> On Mon, 21 Oct 2019, Christophe Leroy wrote:
>>>>
>>>>> This is an attempt to switch the powerpc/32 vdso to the generic C
>>>>> implementation.
>>>>> It will likely not work on 64-bit, or even build properly, at the moment.
>>>>>
>>>>> powerpc is a bit special for the VDSO, as well as for system calls, in
>>>>> that it requires setting the CR SO bit, which cannot be done in C.
>>>>> Therefore, entry/exit and fallback need to be performed in ASM.
>>>>>
>>>>> To allow that, C fallbacks just return -1 and the ASM entry point
>>>>> performs the system call when the C function returns -1.
>>>>>
>>>>> The performance is rather disappointing. That's most likely because all
>>>>> calculations in the C implementation are based on 64-bit math and
>>>>> converted to 32 bits at the very end. I guess the C implementation should
>>>>> use 32-bit math like the assembly VDSO does today.
>>>>
>>>>> gettimeofday:    vdso: 750 nsec/call
>>>>>
>>>>> gettimeofday:    vdso: 1533 nsec/call
>>>
>>> Small improvement (3%) with the proposed change:
>>>
>>> gettimeofday:    vdso: 1485 nsec/call
>>
>> By inlining do_hres() I get the following:
>>
>> gettimeofday:    vdso: 1072 nsec/call
> 
> What's the effect for clock_gettime()?
> 
> gettimeofday() is suboptimal vs. the PPC ASM variant due to an extra
> division, but clock_gettime() should be 1:1 comparable.
> 

Original PPC asm:
clock-gettime-realtime:    vdso: 928 nsec/call

My original RFC:
clock-gettime-realtime:    vdso: 1570 nsec/call

With your suggested vdso_calc_delta():
clock-gettime-realtime:    vdso: 1512 nsec/call

With your vdso_calc_delta() and inlined do_hres():
clock-gettime-realtime:    vdso: 1302 nsec/call

Christophe
