Message-ID: <1d685415-2550-fc16-2675-b344a5496099@c-s.fr>
Date:   Sat, 26 Oct 2019 17:54:59 +0200
From:   Christophe Leroy <christophe.leroy@....fr>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.



On 26/10/2019 at 15:55, Andy Lutomirski wrote:
> On Tue, Oct 22, 2019 at 6:56 AM Christophe Leroy
> <christophe.leroy@....fr> wrote:
>>
>>
>>>>> The performance is rather disappointing. That's most likely because all
>>>>> calculations in the C implementation are based on 64-bit math and
>>>>> converted to 32 bits at the very end. I guess the C implementation should
>>>>> use 32-bit math like the assembly VDSO does as of today.
>>>>
>>>>> gettimeofday:    vdso: 750 nsec/call
>>>>>
>>>>> gettimeofday:    vdso: 1533 nsec/call
>>>
>>> Small improvement (3%) with the proposed change:
>>>
>>> gettimeofday:    vdso: 1485 nsec/call
>>
>> By inlining do_hres() I get the following:
>>
>> gettimeofday:    vdso: 1072 nsec/call
>>
> 
> A perf report might be informative.
> 

Not sure there is much to learn from the perf report:

With the original RFC:

     51.83%  test_vdso  [vdso]             [.] do_hres
     24.86%  test_vdso  [vdso]             [.] __c_kernel_gettimeofday
      7.33%  test_vdso  [vdso]             [.] __kernel_gettimeofday
      5.77%  test_vdso  test_vdso          [.] main
      1.55%  test_vdso  [kernel.kallsyms]  [k] copy_page
      0.67%  test_vdso  libc-2.23.so       [.] _dl_addr
      0.55%  test_vdso  ld-2.23.so         [.] do_lookup_x

With vdso_calc_delta() optimised as suggested by Thomas + inlined do_hres():

     68.00%  test_vdso  [vdso]             [.] __c_kernel_gettimeofday
     12.59%  test_vdso  [vdso]             [.] __kernel_gettimeofday
      6.22%  test_vdso  test_vdso          [.] main
      2.07%  test_vdso  [kernel.kallsyms]  [k] copy_page
      1.04%  test_vdso  ld-2.23.so         [.] _dl_relocate_object
      0.89%  test_vdso  ld-2.23.so         [.] do_lookup_x
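
To illustrate the earlier remark about 64-bit versus 32-bit math, here is a
minimal sketch of a mult/shift conversion from a counter delta to
nanoseconds, written both ways. It is neither the kernel's vdso_calc_delta()
nor the change Thomas suggested; the function names, and the assumption that
the delta and the result both fit in 32 bits, are mine and purely
illustrative. On a 32-bit CPU the first variant needs a multi-word subtract,
mask and multiply, while the second needs one 32-bit subtract and one
32x32->64 multiply.

```c
#include <stdint.h>

/*
 * 64-bit variant: every intermediate value is 64 bits wide.
 * Assumes shift < 64 and that the product does not overflow 64 bits.
 */
static uint64_t delta_to_ns_64(uint64_t cycles, uint64_t last,
			       uint64_t mask, uint32_t mult, uint32_t shift)
{
	return (((cycles - last) & mask) * mult) >> shift;
}

/*
 * 32-bit variant (hypothetical): only the low words of the counter are
 * used. This is valid only if fewer than 2^32 cycles have elapsed since
 * 'last' and the shifted result fits in 32 bits.
 */
static uint32_t delta_to_ns_32(uint32_t cycles_lo, uint32_t last_lo,
			       uint32_t mult, uint32_t shift)
{
	uint32_t delta = cycles_lo - last_lo;	/* wraps modulo 2^32 */

	return (uint32_t)(((uint64_t)delta * mult) >> shift);
}
```

Both variants agree as long as those assumptions hold, including when the
low word of the counter wraps between the two reads.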

I tried 'perf annotate', but I have not found how to tell perf to use the
vdso32.so.dbg file when annotating [vdso].

Test app:

#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/time.h>

static int (*gettimeofday_vdso)(struct timeval *tv, struct timezone *tz);

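int main(int argc, char **argv)
{
	void *handle = dlopen("linux-vdso32.so.1", RTLD_NOW | RTLD_GLOBAL);
	struct timeval tv;
	struct timezone tz;
	int i;

	if (!handle) {
		fprintf(stderr, "dlopen: %s\n", dlerror());
		return 1;
	}

	/* Clear any stale error before calling dlsym(). */
	(void)dlerror();

	gettimeofday_vdso = dlsym(handle, "__kernel_gettimeofday");
	if (dlerror() || !gettimeofday_vdso) {
		fprintf(stderr, "__kernel_gettimeofday not found in vDSO\n");
		return 1;
	}

	/* Call the vDSO entry point directly, bypassing libc. */
	for (i = 0; i < 100000; i++)
		gettimeofday_vdso(&tv, &tz);

	return 0;
}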
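
The test app above only loops so that perf has something to sample; it does
not print a timing. A rough sketch of how an nsec/call figure like the ones
quoted earlier could be obtained is shown below. The helper names
(bench_nsec_per_call, libc_gettimeofday) and the choice of CLOCK_MONOTONIC
are my assumptions, not the tool that produced the numbers in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for struct timezone and clock_gettime() */
#endif
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

typedef int (*gettimeofday_fn)(struct timeval *tv, struct timezone *tz);

/* libc baseline with the same signature, for comparison. */
static int libc_gettimeofday(struct timeval *tv, struct timezone *tz)
{
	return gettimeofday(tv, tz);
}

/*
 * Hypothetical helper: average cost of one call to fn in nanoseconds,
 * or -1 on bad arguments or clock failure. The loop overhead is
 * included in the result.
 */
static long bench_nsec_per_call(gettimeofday_fn fn, long iterations)
{
	struct timespec start, end;
	struct timeval tv;
	struct timezone tz;
	int64_t elapsed;
	long i;

	if (!fn || iterations <= 0)
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1;

	for (i = 0; i < iterations; i++)
		fn(&tv, &tz);

	if (clock_gettime(CLOCK_MONOTONIC, &end))
		return -1;

	elapsed = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
		  (end.tv_nsec - start.tv_nsec);
	return (long)(elapsed / iterations);
}
```

In the test app, the loop could be replaced by something like
printf("gettimeofday:    vdso: %ld nsec/call\n",
bench_nsec_per_call(gettimeofday_vdso, 100000)), with a second call passing
libc_gettimeofday as a baseline.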


Christophe
