[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK8P3a1QfynDxkKBLXVa5Qj_z+7ct3QB5sDYuY5QV3ht-a0cTg@mail.gmail.com>
Date: Fri, 14 Sep 2018 16:22:44 +0200
From: Arnd Bergmann <arnd@...db.de>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andy Lutomirski <luto@...nel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, matt@...trans.com.au,
Stephen Boyd <sboyd@...nel.org>,
John Stultz <john.stultz@...aro.org>,
Florian Weimer <fweimer@...hat.com>,
"K. Y. Srinivasan" <kys@...rosoft.com>, vkuznets@...hat.com,
devel@...uxdriverproject.org,
virtualization@...ts.linux-foundation.org,
Paolo Bonzini <pbonzini@...hat.com>,
Juergen Gross <jgross@...e.com>,
Deepa Dinamani <deepa.kernel@...il.com>
Subject: Re: [patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support
On Fri, Sep 14, 2018 at 2:52 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Matt attempted to add CLOCK_TAI support to the VDSO clock_gettime()
> implementation, which extended the clockid switch case and added yet
> another slightly different copy of the same code.
>
> Especially the extended switch case is problematic as the compiler tends to
> generate a jump table which then requires to use retpolines. If jump tables
> are disabled it adds yet another conditional to the existing maze.
>
> This series takes a different approach by consolidating the almost
> identical functions into one implementation for high resolution clocks and
> one for the coarse grained clock ids by storing the base data for each
> clock id in an array which is indexed by the clock id.
>
> This completely eliminates the switch case and allows further
> simplifications of the code base, which at the end all together gain a few
> cycles performance or at least stay on par with todays code. The resulting
> performance depends heavily on the micro architecture and the compiler.
No objections from my side, just a few general remarks: Deepa
and I have discussed the vdso in the past for 2038. I have started
a patch that I'll have to redo on top of yours, but that is fine, your
cleanup is only going to help here.
More generally speaking, Deepa said that she would like to see
some generalization on the vdso before adding the time64_t
based variants. Your series probably makes x86 diverge more
from the others, which makes it harder to consolidate them again,
but we might just as well use your new implementation to base the
generic one on, and just move the other ones over to that.
A couple of architectures (s390, ia64, riscv, powerpc, arm64)
implement the vdso as assembler code at the moment, so they
won't be as easy to consolidate (other than outright replacing all
the code).
The other five:
arch/x86/entry/vdso/vclock_gettime.c
arch/sparc/vdso/vclock_gettime.c
arch/nds32/kernel/vdso/gettimeofday.c
arch/mips/vdso/gettimeofday.c
arch/arm/vdso/vgettimeofday.c
are basically all minor variations of the same code base and could be
consolidated to some degree.
Any suggestions here? Should we plan to do that consolitdation based on
your new version, or just add clock_gettime64 in arm32 and x86-32, and then
be done with it? The other ones will obviously still be fast for 32-bit time_t
and will have a working non-vdso sys_clock_getttime64().
I also wonder about clock_getres(): half the architectures seem to implement
it in vdso, but notably arm32 and x86 don't, and I had not expected it to be
performance critical given that the result is easily cached in user space.
Arnd
Powered by blists - more mailing lists