Message-ID: <f4486e86-3c0c-0eec-1639-0e5956cdb8f1@c-s.fr>
Date: Tue, 22 Oct 2019 11:01:45 +0200
From: Christophe Leroy <christophe.leroy@....fr>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
vincenzo.frascino@....com, luto@...nel.org,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.
On 21/10/2019 at 23:29, Thomas Gleixner wrote:
> On Mon, 21 Oct 2019, Christophe Leroy wrote:
>
>> This is an attempt to switch the powerpc/32 vdso to the generic C
>> implementation. It will likely not work on 64 bits, or even build
>> properly, at the moment.
>>
>> powerpc is a bit special for the VDSO, as it is for system calls, in
>> that it requires setting the CR SO bit, which cannot be done in C.
>> Therefore, entry/exit and fallback need to be performed in ASM.
>>
>> To allow that, C fallbacks just return -1 and the ASM entry point
>> performs the system call when the C function returns -1.
>>
>> The performance is rather disappointing. That's most likely because all
>> calculations in the C implementation are based on 64-bit math and
>> converted to 32 bits at the very end. I guess the C implementation should
>> use 32-bit math like the assembly VDSO does as of today.
>
>> gettimeofday: vdso: 750 nsec/call
>>
>> gettimeofday: vdso: 1533 nsec/call
Small improvement (3%) with the proposed change:
gettimeofday: vdso: 1485 nsec/call
Though still some way to go.
Christophe
>
> The only real 64bit math which can matter is the 64bit * 32bit multiply,
> i.e.
>
> static __always_inline
> u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
> {
> 	return ((cycles - last) & mask) * mult;
> }
>
> Everything else is trivial add/sub/shift, which should be roughly the same
> in ASM.
>
> Can you try to replace that with:
>
> static __always_inline
> u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
> {
> 	u64 res, delta = ((cycles - last) & mask);
> 	u32 dh, dl;
>
> 	/* Split the 64-bit delta into low and high 32-bit halves */
> 	dl = delta;
> 	dh = delta >> 32;
>
> 	/* 32x32->64 multiply on the low half; high half only if non-zero */
> 	res = mul_u32_u32(dl, mult);
> 	if (dh)
> 		res += mul_u32_u32(dh, mult) << 32;
>
> 	return res;
> }
>
> That's pretty much what __do_get_tspec does in ASM.
>
> Thanks,
>
> tglx
>
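For anyone who wants to check the split multiply outside the kernel, here is a minimal userspace sketch. It is an illustration under assumptions, not the code that was benchmarked above: mul_u32_u32() is a local stand-in for the kernel helper of the same name, and the function names calc_delta_full(), calc_delta_split() and calc_delta_selfcheck() are made up for this example. It shows that the two variants give the same result modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's mul_u32_u32(): 32 x 32 -> 64 bit multiply. */
static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* Straight 64 x 32 multiply, as in the generic C VDSO code quoted above. */
static inline uint64_t calc_delta_full(uint64_t cycles, uint64_t last,
				       uint64_t mask, uint32_t mult)
{
	return ((cycles - last) & mask) * mult;
}

/* Same computation built from 32 x 32 multiplies (hypothetical name). */
static inline uint64_t calc_delta_split(uint64_t cycles, uint64_t last,
					uint64_t mask, uint32_t mult)
{
	uint64_t res, delta = (cycles - last) & mask;
	uint32_t dl = (uint32_t)delta;		/* low 32 bits */
	uint32_t dh = (uint32_t)(delta >> 32);	/* high 32 bits */

	res = mul_u32_u32(dl, mult);
	if (dh)	/* usually zero for short deltas, so this multiply is skipped */
		res += mul_u32_u32(dh, mult) << 32;

	return res;
}

/* Compare both variants on a small delta and on one with a non-zero high half. */
static inline void calc_delta_selfcheck(void)
{
	assert(calc_delta_split(100, 40, UINT64_MAX, 3) ==
	       calc_delta_full(100, 40, UINT64_MAX, 3));
	assert(calc_delta_split((1ULL << 40) + 5, 0, UINT64_MAX, 7) ==
	       calc_delta_full((1ULL << 40) + 5, 0, UINT64_MAX, 7));
}
```

The shift by 32 discards the upper half of the second product, which matches what the plain 64-bit multiply does when it wraps, so both variants agree even when the result overflows 64 bits.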