Date:	Fri, 27 Mar 2015 15:25:47 +0100
From:	Denys Vlasenko <dvlasenk@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>,
	Andy Lutomirski <luto@...capital.net>,
	Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: ia32_sysenter_target does not preserve EFLAGS

Hi,

While running some tests I noticed that EFLAGS
is not preserved across syscalls if I use 32-bit
userspace, use SYSENTER, and paravirt is active.

Looking at the code, it's actually clear why that happens.

/*
 * SYSENTER loads ss, rsp, cs, and rip from previously programmed MSRs.
 * IF and VM in rflags are cleared (IOW: interrupts are off).
 * SYSENTER does not save anything on the stack,
 * and does not save old rip (!!!) and rflags.
 */
ENTRY(ia32_sysenter_target)
        SWAPGS_UNSAFE_STACK  <============================
        movq    PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
        ENABLE_INTERRUPTS(CLBR_NONE)

        movl    %ebp, %ebp
        movl    %eax, %eax
        movl    ASM_THREAD_INFO(TI_sysenter_return, %rsp, 0), %r10d

        /* Construct struct pt_regs on stack */
        pushq_cfi       $__USER32_DS            /* pt_regs->ss */
        pushq_cfi       %rbp                    /* pt_regs->sp */
        CFI_REL_OFFSET  rsp,0
        pushfq_cfi                              /* pt_regs->flags */

SWAPGS_UNSAFE_STACK, if it involves paravirt callbacks,
will change EFLAGS, and it *can't* save/restore them:
there is no place to save them, since neither the stack nor
PER_CPU() is usable at that point.

Interestingly, *no one ever complained*!

Apparently, users *don't* depend on arithmetic flags
surviving across a syscall. They are also okay with the DF flag
being cleared.

Let's go flag-by-flag.

ID - probably no one depends on it
VIP,VIF,VM - v86 stuff, not supported in 64bit
AC - someone probably does use this
RF - should be cleared to 0
NT - iret via task gate, not supported in 64bit
IOPL - usually 00, sys_iopl() can change it
DF - according to C ABI, should be 0
IF - should be preserved (but almost always 1)
TF - should be preserved
arith flags - probably no one cares

IOW, the only bits to be preserved are AC, IOPL, TF, and _maybe_
IF.
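
As a rough illustration of the flag-by-flag list, the sketch below writes the
EFLAGS bits as C masks and ORs together the ones named above. The bit positions
follow the x86 architecture definition; the EFL_* macros and the
efl_must_preserve() helper are names made up for this example and are not
kernel identifiers.

```c
#include <stdint.h>

/* EFLAGS bit positions per the x86 architecture.
 * The EFL_* names are hypothetical, used only in this sketch. */
#define EFL_CF   (1u << 0)   /* carry */
#define EFL_PF   (1u << 2)   /* parity */
#define EFL_AF   (1u << 4)   /* auxiliary carry */
#define EFL_ZF   (1u << 6)   /* zero */
#define EFL_SF   (1u << 7)   /* sign */
#define EFL_TF   (1u << 8)   /* trap (single-step) */
#define EFL_IF   (1u << 9)   /* interrupt enable */
#define EFL_DF   (1u << 10)  /* direction */
#define EFL_OF   (1u << 11)  /* overflow */
#define EFL_IOPL (3u << 12)  /* I/O privilege level, two bits */
#define EFL_NT   (1u << 14)  /* nested task */
#define EFL_RF   (1u << 16)  /* resume */
#define EFL_VM   (1u << 17)  /* virtual-8086 mode */
#define EFL_AC   (1u << 18)  /* alignment check */
#define EFL_VIF  (1u << 19)  /* virtual interrupt flag */
#define EFL_VIP  (1u << 20)  /* virtual interrupt pending */
#define EFL_ID   (1u << 21)  /* CPUID detection */

/* The "arith flags" from the list above. */
#define EFL_ARITH (EFL_CF | EFL_PF | EFL_AF | EFL_ZF | EFL_SF | EFL_OF)

/* Bits the list concludes must survive a syscall: AC, IOPL, TF,
 * and IF only if the caller decides the "maybe" in its favor. */
static inline uint32_t efl_must_preserve(int include_if)
{
    uint32_t mask = EFL_AC | EFL_IOPL | EFL_TF;

    if (include_if)
        mask |= EFL_IF;
    return mask;
}
```

Without IF the mask comes to 0x43100; with IF it is 0x43300.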

AC and IOPL are preserved even with this paravirt quirk
because paravirt hooks do not mangle them.

TF preservation and proper restoration is handled by
	do_debug + syscall_trace_enter_phase2 + iret
combo.

We unconditionally set IF. This is only a problem for applications
which use sys_iopl(3), disable IRQs in userspace, and then perform
syscalls. The set of such apps is probably empty.
(This "bug" exists even in the non-paravirt case.)

So, formally, we have a bug: we do not preserve IF,
DF and arith flags.

I'm proposing to use this opportunity to amend the syscall ABI
to say that arith flags are not preserved across syscalls,
and DF can be cleared to 0 by syscalls (but can't be set to 1).
Evidently, this has been broken for some time on some virtualized
setups, and users are okay with it.
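
One way to read the proposed rule is as a predicate over the flags seen before
and after a syscall. The sketch below is only that reading, not kernel code:
arith flags are ignored, DF may go from 1 to 0 but not from 0 to 1, and AC,
IOPL and TF must come back unchanged. IF is deliberately left out of the check,
since what to do about it is still an open question. The EFL_* macros and
efl_abi_ok() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 EFLAGS bits used by the check; names are hypothetical. */
#define EFL_TF   (1u << 8)   /* trap (single-step) */
#define EFL_DF   (1u << 10)  /* direction */
#define EFL_IOPL (3u << 12)  /* I/O privilege level, two bits */
#define EFL_AC   (1u << 18)  /* alignment check */

/* Return true if 'after' is an allowed result of a syscall entered
 * with 'before', under the ABI wording proposed above. */
static bool efl_abi_ok(uint32_t before, uint32_t after)
{
    const uint32_t keep = EFL_AC | EFL_IOPL | EFL_TF;

    /* AC, IOPL and TF must be identical on both sides. */
    if ((before ^ after) & keep)
        return false;

    /* DF may be cleared by the syscall, but never newly set. */
    if (!(before & EFL_DF) && (after & EFL_DF))
        return false;

    /* Arith flags (and, in this sketch, IF) are not checked. */
    return true;
}
```

For example, efl_abi_ok(0x602, 0x202) accepts a syscall that cleared DF, while
efl_abi_ok(0x202, 0x602) rejects one that set it.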

I'm not sure what to do with the "bug" of forcing IF=1.
Fix it? Or also declare that syscalls can set IF=1?
Do you think this is legitimate userspace code?

	sys_iopl(3);
	cli;
	syscall();
	/* expects irqs still disabled */

-- 
vda