Message-ID: <1d685415-2550-fc16-2675-b344a5496099@c-s.fr>
Date: Sat, 26 Oct 2019 17:54:59 +0200
From: Christophe Leroy <christophe.leroy@....fr>
To: Andy Lutomirski <luto@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Vincenzo Frascino <vincenzo.frascino@....com>,
LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.
On 26/10/2019 at 15:55, Andy Lutomirski wrote:
> On Tue, Oct 22, 2019 at 6:56 AM Christophe Leroy
> <christophe.leroy@....fr> wrote:
>>
>>
>>>>> The performance is rather disappointing. That is most likely because
>>>>> all calculations in the C implementation are based on 64-bit math and
>>>>> converted to 32 bits at the very end. I guess the C implementation
>>>>> should use 32-bit math like the assembly VDSO does today.
>>>>
>>>>> gettimeofday: vdso: 750 nsec/call
>>>>>
>>>>> gettimeofday: vdso: 1533 nsec/call
>>>
>>> Small improvement (3%) with the proposed change:
>>>
>>> gettimeofday: vdso: 1485 nsec/call
>>
>> By inlining do_hres() I get the following:
>>
>> gettimeofday: vdso: 1072 nsec/call
>>
>
> A perf report might be informative.
>
Not sure there is much to learn from perf report:
With the original RFC:
    51.83%  test_vdso  [vdso]             [.] do_hres
    24.86%  test_vdso  [vdso]             [.] __c_kernel_gettimeofday
     7.33%  test_vdso  [vdso]             [.] __kernel_gettimeofday
     5.77%  test_vdso  test_vdso          [.] main
     1.55%  test_vdso  [kernel.kallsyms]  [k] copy_page
     0.67%  test_vdso  libc-2.23.so       [.] _dl_addr
     0.55%  test_vdso  ld-2.23.so         [.] do_lookup_x
With vdso_calc_delta() optimised as suggested by Thomas + inlined do_hres():
    68.00%  test_vdso  [vdso]             [.] __c_kernel_gettimeofday
    12.59%  test_vdso  [vdso]             [.] __kernel_gettimeofday
     6.22%  test_vdso  test_vdso          [.] main
     2.07%  test_vdso  [kernel.kallsyms]  [k] copy_page
     1.04%  test_vdso  ld-2.23.so         [.] _dl_relocate_object
     0.89%  test_vdso  ld-2.23.so         [.] do_lookup_x
I tried 'perf annotate', but I have not found how to tell perf to use
the vdso32.so.dbg file when annotating [vdso].
Test app:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static int (*gettimeofday_vdso)(struct timeval *tv, struct timezone *tz);

int main(void)
{
	void *handle = dlopen("linux-vdso32.so.1", RTLD_NOW | RTLD_GLOBAL);
	struct timeval tv;
	struct timezone tz;
	int i;

	if (!handle) {
		fprintf(stderr, "dlopen: %s\n", dlerror());
		return EXIT_FAILURE;
	}

	(void)dlerror();	/* clear any stale error before dlsym() */
	gettimeofday_vdso = dlsym(handle, "__kernel_gettimeofday");
	if (dlerror() || !gettimeofday_vdso) {
		/* bail out instead of calling through a NULL pointer */
		fprintf(stderr, "__kernel_gettimeofday not found\n");
		return EXIT_FAILURE;
	}

	/* call the VDSO entry point directly, bypassing libc */
	for (i = 0; i < 100000; i++)
		gettimeofday_vdso(&tv, &tz);

	return 0;
}
Christophe