Message-ID: <CALCETrWP41MRGVT7oYoP5_f3jdh5nEvKmDrYuHWwsRgUZCGf=w@mail.gmail.com>
Date: Tue, 14 Aug 2018 07:20:20 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Matt Rickard <matt@...trans.com.au>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Woodhouse <dwmw2@...radead.org>, X86 ML <x86@...nel.org>,
Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO
[Added a whole bunch of ccs]
On Mon, Aug 13, 2018 at 6:17 PM, Matt Rickard <matt@...trans.com.au> wrote:
> Process clock_gettime(CLOCK_TAI) in VDSO. This makes the call about as fast as
> CLOCK_REALTIME instead of taking about four times as long.
>
> Signed-off-by: Matt Rickard <matt@...trans.com.au>
> ---
> arch/x86/entry/vdso/vclock_gettime.c | 30 ++++++++++++++++++++++++++++++
> arch/x86/entry/vsyscall/vsyscall_gtod.c | 2 ++
> arch/x86/include/asm/vgtod.h | 1 +
> 3 files changed, 33 insertions(+)
>
> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
> index f19856d95c60..bc8d8f086721 100644
> --- a/arch/x86/entry/vdso/vclock_gettime.c
> +++ b/arch/x86/entry/vdso/vclock_gettime.c
...
> notrace static void do_realtime_coarse(struct timespec *ts)
> {
> unsigned long seq;
> @@ -284,8 +305,17 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
> do_monotonic_coarse(ts);
> break;
> default:
> + /* Doubled switch statement to work around kernel Makefile error */
> + /* See: https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg567499.html */

NAK.

The issue here (after reading that thread) is that, with our current
compile options, gcc generates a jump table once the switch statement
hits five entries. It dispatches through that table with a retpoline,
and the relocations it emits for this somehow make the vDSO build
fail. We need to address this so that the vDSO build is reliable, but
there's an important question here:

Should the vDSO be built with retpolines, or should it be built with
indirect branches? Or should we go out of our way to make sure that
the vDSO contains neither retpolines nor indirect branches?

We could accomplish the latter (sort of) by manually converting the
switch into the appropriate if statements, but that's rather ugly.

(Hmm. We should add exports to directly read each clock source.
They'll be noticeably faster, especially when
cache-and-predictor-cold.)