Message-ID: <CAObL_7GaPDNops_cL2u4yU9Afj14f6ju_ZW7qRjhSy25qsbfkw@mail.gmail.com>
Date: Fri, 13 Jul 2012 16:00:04 -0700
From: Andrew Lutomirski <luto@....edu>
To: Will Drewry <wad@...omium.org>
Cc: linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
qmewlo@...il.com, eparis@...hat.com, keescook@...omium.org,
james.l.morris@...cle.com, hpa@...or.com, cevans@...omium.org
Subject: Re: [PATCH v2] x86/vsyscall: allow seccomp filter in vsyscall=emulate
On Fri, Jul 13, 2012 at 10:06 AM, Will Drewry <wad@...omium.org> wrote:
> If a seccomp filter program is installed, older static binaries and
> distributions with older libc implementations (glibc 2.13 and earlier)
> that rely on vsyscall use will be terminated regardless of the filter
> program policy when executing time, gettimeofday, or getcpu. This is
> only the case when vsyscall emulation is in use (vsyscall=emulate is the
> default).
>
> This patch emulates system call entry inside a vsyscall=emulate by
> populating regs->ax and regs->orig_ax with the system call number prior
> to calling into seccomp such that all seccomp-dependencies function
> normally. Additionally, system call return behavior is emulated in line
> with other vsyscall entrypoints for the trace/trap cases.
>
> Reported-by: Owen Kibel <qmewlo@...il.com>
> Signed-off-by: Will Drewry <wad@...omium.org>
>
> v2: - fixed ip and sp on SECCOMP_RET_TRAP/TRACE (thanks to luto@....edu)
> @@ -253,6 +273,12 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
>
>  	current_thread_info()->sig_on_uaccess_error = prev_sig_on_uaccess_error;
>
> +	if (skip) {
> +		if ((long)regs->ax <= 0L) /* seccomp errno emulation */
> +			goto do_ret;
> +		goto done; /* seccomp trace/trap */
> +	}
> +
>  	if (ret == -EFAULT) {
>  		/* Bad news -- userspace fed a bad pointer to a vsyscall. */
>  		warn_bad_vsyscall(KERN_INFO, regs,
> @@ -271,10 +297,11 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)
>
>  	regs->ax = ret;
>
> +do_ret:
>  	/* Emulate a ret instruction. */
>  	regs->ip = caller;
>  	regs->sp += 8;
> -
> +done:
>  	return true;
>
>  sigsegv:
> --
> 1.7.9.5
>
This has the same odd property as the sigsegv path that the faulting
instruction will appear to be the mov, not the syscall. That seems to
be okay, though -- various pieces of code that try to restart the segv
are okay with that.

Is there any code that assumes that changing rax (i.e. the syscall
number) and restarting a syscall after SIGSYS will invoke the new
syscall? (The RET_TRACE path might be similar -- does the
ptrace_event(PTRACE_EVENT_SECCOMP, data) in seccomp.c give a debugger
a chance to synchronously cancel or change the syscall?)

If those issues aren't problems, then:

Reviewed-by: Andy Lutomirski <luto@...capital.net>

(If the syscall number needs to change after the fact in the
SECCOMP_RET_TRAP case, it'll be a mess.)
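
For reference, the control flow the quoted hunks add at the end of
emulate_vsyscall() can be modeled in userspace roughly as below. This
is only a sketch for reasoning about the three outcomes: struct
model_regs and model_vsyscall_tail() are made-up names, not kernel
code, the -EFAULT handling is left out, and the values are assumed to
be whatever seccomp left in regs->ax.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pt_regs fields the hunks touch. */
struct model_regs {
	long ax;		/* return value (or syscall number slot) */
	unsigned long ip;	/* instruction pointer */
	unsigned long sp;	/* stack pointer */
};

/*
 * Model of the tail of emulate_vsyscall() with the patch applied.
 *   skip   - seccomp asked for the call to be skipped
 *   ret    - result of the emulated call (unused when skip is set)
 *   caller - return address that was read from the user stack
 */
static void model_vsyscall_tail(struct model_regs *regs, bool skip,
				long ret, unsigned long caller)
{
	if (skip) {
		if (regs->ax <= 0)	/* seccomp errno emulation */
			goto do_ret;
		goto done;		/* seccomp trace/trap */
	}

	regs->ax = ret;

do_ret:
	/* Emulate a ret instruction. */
	regs->ip = caller;
	regs->sp += 8;
done:
	return;
}
```

In this model the errno case keeps the value already in ax and still
emulates the ret, while the trace/trap case leaves ax, ip and sp
exactly as they were on entry, which is the case the questions above
are about.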
--Andy