Message-ID: <20140127002255.GA10323@ZenIV.linux.org.uk>
Date: Mon, 27 Jan 2014 00:22:55 +0000
From: Al Viro <viro@...IV.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Anvin <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] de-asmify the x86-64 system call slowpath

On Sun, Jan 26, 2014 at 02:28:15PM -0800, Linus Torvalds wrote:
> The x86-64 (and 32-bit, for that matter) system call slowpaths are all
> in C, but the *selection* of which slow-path to take is a mixture of
> complicated assembler ("sysret_check -> sysret_careful ->
> sysret_signal -> sysret_audit -> int_check_syscall_exit_work" etc), and
> oddly named and placed C code ("schedule_user" vs
> "__audit_syscall_exit" vs "do_notify_resume").
>
> This attached patch tries to take the "do_notify_resume()" approach,
> renaming it to something sane ("syscall_exit_slowpath") and calling
> out to *all* the different slow cases from that one place, instead of
> having some cases hardcoded in asm, and some in C. And instead of
> hardcoding which cases result in a "iretq" and which cases result in a
> faster sysret case, it's now simply a return value from that
> syscall_exit_slowpath() function, so it's very natural and easy to say
> "taking a signal will force us to do the slow iretq case, but we can
> do the task exit work and still do the sysret".
>
> I've marked this as an RFC, because I didn't bother trying to clean up
> the 32-bit code similarly (no test-cases, and trust me, if you get
> this wrong, it will fail spectacularly but in very subtle and
> hard-to-debug ways), and I also didn't bother with the slow cases in
> the "iretq" path, so that path still has the odd asm cases and calls
> the old (now legacy) do_notify_resume() path.

Umm...  Can't uprobe_notify_resume() modify regs as well?  While we
are at it, when we start using the same thing on 32-bit kernels, we'll
need to watch out for execve() - the reason start_thread() sets
TIF_NOTIFY_RESUME is to force us away from the sysexit path.  IIRC, vm86
is another thing to watch out for (same reasons).
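
To make the concern concrete, here is a rough user-space model of a single C
dispatch point whose return value picks the exit path.  It is a sketch only,
not the code from the patch: the WORK_* bits, enum ret_path and
syscall_exit_slowpath_model() are all made-up names for illustration, and the
choice of which cases force the slow return is an assumption (signals as in
the quoted description, plus uprobes and, conservatively, TIF_NOTIFY_RESUME as
raised above for the 32-bit execve() case).

```c
#include <assert.h>

/* Hypothetical work bits for this model; they only echo kernel TIF_* names. */
enum work_bit {
    WORK_NEED_RESCHED  = 1 << 0,
    WORK_SYSCALL_AUDIT = 1 << 1,
    WORK_SIGPENDING    = 1 << 2,
    WORK_UPROBE        = 1 << 3,
    WORK_NOTIFY_RESUME = 1 << 4,
};

/* What the asm caller would branch on after the C call returns. */
enum ret_path { RET_SYSRET = 0, RET_IRET = 1 };

/*
 * Assumption: handlers for these bits may rewrite the saved user
 * registers, so the fast return must not be used after them.
 */
#define WORK_MAY_CHANGE_REGS \
    (WORK_SIGPENDING | WORK_UPROBE | WORK_NOTIFY_RESUME)

/*
 * Model of the single slow-path entry point.  A real version would call
 * the handler for each set bit here; this model only decides which
 * return path the caller has to take afterwards.
 */
enum ret_path syscall_exit_slowpath_model(unsigned int work)
{
    if (work & WORK_MAY_CHANGE_REGS)
        return RET_IRET;
    return RET_SYSRET;
}

/* Small self-check of the decisions the model makes. */
void syscall_exit_slowpath_model_check(void)
{
    /* Rescheduling or auditing alone keeps the fast return. */
    assert(syscall_exit_slowpath_model(WORK_NEED_RESCHED) == RET_SYSRET);
    assert(syscall_exit_slowpath_model(WORK_SYSCALL_AUDIT) == RET_SYSRET);
    /* Anything that may touch regs forces the slow return. */
    assert(syscall_exit_slowpath_model(WORK_SIGPENDING) == RET_IRET);
    assert(syscall_exit_slowpath_model(WORK_UPROBE) == RET_IRET);
    assert(syscall_exit_slowpath_model(WORK_NOTIFY_RESUME) == RET_IRET);
}
```

The point of the mask is that the decision lives in one place: adding a case
that can modify regs means adding one bit to WORK_MAY_CHANGE_REGS rather than
another branch in asm.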