Message-Id: <55DC277202000078000D7272@prv-mh.provo.novell.com>
Date: Tue, 25 Aug 2015 01:29:38 -0600
From: "Jan Beulich" <jbeulich@...e.com>
To: <luto@...capital.net>
Cc: <bp@...en8.de>, <brgerst@...il.com>, <x86@...nel.org>,
<torvalds@...ux-foundation.org>, <dvlasenk@...hat.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: Proposal for finishing the 64-bit x86 syscall cleanup
>>> Andy Lutomirski <luto@...capital.net> 08/24/15 11:14 PM >>>
>Thing 1: partial pt_regs
>
>64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
>r12-r15 are uninitialized. Some syscalls require them to be
>initialized, and they have special awful stubs to do it. The entry
>and exit tracing code (except for phase1 tracing) also need them
>initialized, and they have their own messy initialization. Compat
>syscalls are their own private little mess here.
>
>This gets in the way of all kinds of cleanups, because C code can't
>switch between the full and partial pt_regs states.
>
>I can see two ways out. We could remove the optimization entirely,
>which consists of pushing and popping six more registers and adds
>about ten cycles to fast path syscalls on Sandy Bridge. It also
>simplifies and presumably speeds up the slow paths.
>
>We could also annotate which syscalls need full regs and jump to the
>slow path for them. This would leave the fast path unchanged (we
>could duplicate the syscall table so that regs-requiring syscalls
>would turn into some asm that switches to the slow path). We'd make
>the syscall table say something like:
>
>59 64 execve sys_execve:regs
>
>The fast path would have exactly identical performance and the slow
>path would presumably speed up. The down side would be additional
>complexity.
Namely: would this be any better than the current "special awful" stubs?
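
For illustration only, here is a minimal sketch of how a table line carrying the proposed ":regs" suffix might be parsed into a flag that a table generator could act on. The struct, the function name, and the field sizes are all hypothetical; the only thing taken from the proposal is the line format "59 64 execve sys_execve:regs". It says nothing about how the asm side would switch to the slow path.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one syscall table line. */
struct syscall_entry {
    int  nr;                /* syscall number, e.g. 59 */
    char abi[16];           /* ABI column, e.g. "64" */
    char name[32];          /* syscall name, e.g. "execve" */
    char entry[64];         /* entry point, e.g. "sys_execve" */
    bool needs_full_regs;   /* set when the entry carried ":regs" */
};

/*
 * Parse "<nr> <abi> <name> <entry>[:regs]".
 * Returns 0 on success, -1 on a malformed line or unknown annotation.
 */
static int parse_syscall_line(const char *line, struct syscall_entry *out)
{
    char *suffix;

    if (sscanf(line, "%d %15s %31s %63s",
               &out->nr, out->abi, out->name, out->entry) != 4)
        return -1;

    out->needs_full_regs = false;
    suffix = strchr(out->entry, ':');
    if (suffix) {
        if (strcmp(suffix, ":regs") != 0)
            return -1;          /* only ":regs" is recognized here */
        *suffix = '\0';         /* strip the annotation from the symbol */
        out->needs_full_regs = true;
    }
    return 0;
}
```

A generator built on something like this could emit the plain entry into the fast-path table and a slow-path trampoline for every entry with needs_full_regs set, which is where the extra complexity in question would live.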
>Thing 2: vdso compilation with binutils that doesn't support .cfi directives
>
>Userspace debuggers really like having the vdso properly
>CFI-annotated, and the 32-bit fast syscall entries are annotated
>manually in hexadecimal. AFAIK Jan Beulich is the only person who
>understands it.
>
>I want to be able to change the entries a little bit to clean them up
>(and possibly rework the SYSCALL32 and SYSENTER register tricks, which
>currently suck), but it's really, really messy right now because of
>the hex CFI stuff. Could we just drop the CFI annotations if the
>binutils version is too old or even just require new enough binutils
>to build 32-bit and compat kernels?
I think that's reasonable. IIRC the oldest binutils I'm building with
(SLE10, i.e. 2.16.91-ish) supports the .cfi directives, and I'd suppose the
equally old RHEL binutils does too. I'm not sure whether any other
long-maintained distros carry even older binutils.
Jan