Date:	Sat, 04 Apr 2015 15:59:11 +0200
From:	Denys Vlasenko <dvlasenk@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
	LKML <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>
Subject: sys_execve leaking rbp/rbx/r12-15 to the new process?

Hi guys,

I was looking at some optimizations in stub_execve.

In particular, not doing SAVE_EXTRA_REGS. (To recap,
SAVE_EXTRA_REGS populates pt_regs->bp/bx/r12-15).


It seems redundant, because sys_execve overwrites them via:

static int load_elf_binary(struct linux_binprm *bprm)
{
...
#ifdef ELF_PLAT_INIT
        /*
         * The ABI may specify that certain registers be set up in special
         * ways (on i386 %edx is the address of a DT_FINI function, for
         * example.  In addition, it may also specify (eg, PowerPC64 ELF)
         * that the e_entry field is the address of the function descriptor
         * for the startup routine, rather than the address of the startup
         * routine itself.  This macro performs whatever initialization to
         * the regs structure is required as well as any relocations to the
         * function descriptor entries when executing dynamically links apps.
         */
        ELF_PLAT_INIT(regs, reloc_func_desc);
#endif


where ELF_PLAT_INIT is:

#define ELF_PLAT_INIT(_r, load_addr)                    \
        elf_common_init(&current->thread, _r, 0)

static inline void elf_common_init(struct thread_struct *t,
                                   struct pt_regs *regs, const u16 ds)
{
        regs->ax = regs->bx = regs->cx = regs->dx = 0;
        regs->si = regs->di = regs->bp = 0;
        regs->r8 = regs->r9 = regs->r10 = regs->r11 = 0;
        regs->r12 = regs->r13 = regs->r14 = regs->r15 = 0;
        t->fs = t->gs = 0;
        t->fsindex = t->gsindex = 0;
        t->ds = t->es = ds;
}

But then I recalled that ELF is not the only binary format there is. Ho hum.

binfmt_flat.c has similar FLAT_PLAT_INIT, but x86 (and everyone else
except sh) doesn't define it.

binfmt_elf_fdpic.c has ELF_FDPIC_PLAT_INIT, but x86 (and most others)
doesn't define it.

I don't see any such hooks in binfmt_aout.c et al.

IOW: it looks like we do not clear these registers for any executable
types except standard ELF. We inherit their values from the prior executable.

Is this intended?
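
To make the question concrete, here is a rough userspace model in C of
what I think happens today. It is only a sketch, not kernel code:
struct extra_regs, enum binfmt, x86_loader_clears_regs() and
model_execve() are names made up for illustration, and the "only ELF
clears" rule simply encodes my reading of the binfmt sources above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the callee-saved slots of pt_regs. */
struct extra_regs {
        uint64_t bx, bp, r12, r13, r14, r15;
};

/* Hypothetical tags for the binary formats discussed above. */
enum binfmt { FMT_ELF, FMT_FLAT, FMT_ELF_FDPIC, FMT_AOUT };

/*
 * Assumption taken from the survey above: on x86 only the standard ELF
 * loader has a *_PLAT_INIT hook that zeroes the registers.
 */
static bool x86_loader_clears_regs(enum binfmt fmt)
{
        return fmt == FMT_ELF;
}

/*
 * Model of the current stub: SAVE_EXTRA_REGS copies the live registers
 * into the saved frame, the loader may or may not clear that frame, and
 * RESTORE_EXTRA_REGS loads whatever is left back into the registers.
 */
static struct extra_regs model_execve(struct extra_regs live, enum binfmt fmt)
{
        struct extra_regs saved = live;         /* SAVE_EXTRA_REGS */

        if (x86_loader_clears_regs(fmt))        /* e.g. ELF_PLAT_INIT */
                memset(&saved, 0, sizeof(saved));

        return saved;                           /* RESTORE_EXTRA_REGS */
}

static bool all_zero(const struct extra_regs *r)
{
        return (r->bx | r->bp | r->r12 | r->r13 | r->r14 | r->r15) == 0;
}

#ifdef MODEL_DEMO_MAIN
int main(void)
{
        struct extra_regs old = { 1, 2, 3, 4, 5, 6 };
        struct extra_regs elf = model_execve(old, FMT_ELF);
        struct extra_regs aout = model_execve(old, FMT_AOUT);

        printf("ELF zeroed: %d, a.out zeroed: %d\n",
               all_zero(&elf), all_zero(&aout));
        return 0;
}
#endif
```

In this model the new ELF image starts with the registers cleared, while
an a.out (or flat, or FDPIC) image sees whatever the old image left there.
Build with -DMODEL_DEMO_MAIN to get the small demo main().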

I'm asking because this inheritance of registers is not "free". We actively
make it happen:

ENTRY(stub_execve)
        CFI_STARTPROC
        addq $8, %rsp
        SAVE_EXTRA_REGS                 <====
        call sys_execve
        movq %rax,RAX(%rsp)
        RESTORE_EXTRA_REGS              <====
        jmp int_ret_from_sys_call
        CFI_ENDPROC
END(stub_execve)

which is kinda stupid if we don't actually want this to happen, and
instead want them zeroed on success and unchanged on failure.
Those two macros expand into twelve 5-byte instructions.

We can do this instead:

ENTRY(stub_execve)
        CFI_STARTPROC
        call    sys_execve
        testl   %eax, %eax
        jz      1f
        ret
1:      addq    $8, %rsp
        xorl    %ebx, %ebx      // maybe create a macro for zeroing these
        xorl    %ebp, %ebp      //
        xorl    %r12d, %r12d    //
        xorl    %r13d, %r13d    //
        xorl    %r14d, %r14d    //
        xorl    %r15d, %r15d    //
        movq    %rax, RAX(%rsp) /* zero */
        jmp     int_ret_from_sys_call
        CFI_ENDPROC
END(stub_execve)

With this, elf_common_init() no longer needs to zero regs->ax/bx/bp/r12-15:
the code above already does it.


This achieves the following:

* all executable types, not just ELF, get these regs zeroed
* error returns are faster (don't use IRET return code path)
* stores to stack are gone
* XORs are much faster than loads from stack (and smaller too)
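
The intended semantics can be sketched in plain C as well. Again this is
only a userspace model under my own assumptions: struct callee_saved and
proposed_stub() are hypothetical names, and `ret` stands for the value
sys_execve leaves in eax (0 on success, negative errno on failure). The
size constants compare the twelve 5-byte instructions mentioned above
with the six XORs; the XOR sizes are my own count of the usual encodings
(2 bytes for ebx/ebp, 3 bytes for r12d-r15d because of the REX prefix).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the live callee-saved registers. */
struct callee_saved {
        uint64_t bx, bp, r12, r13, r14, r15;
};

enum {
        OLD_STUB_BYTES = 12 * 5,        /* SAVE/RESTORE_EXTRA_REGS, per the mail */
        NEW_XOR_BYTES  = 2 + 2 + 4 * 3, /* six xorl's, counted by hand */
};

/*
 * Model of the proposed stub: a nonzero return value takes the early
 * "ret" path and leaves the registers alone; zero falls through to the
 * xorl block, whatever binary format was loaded.
 */
static struct callee_saved proposed_stub(struct callee_saved live, int ret)
{
        struct callee_saved zeroed = { 0, 0, 0, 0, 0, 0 };

        if (ret != 0)           /* testl %eax, %eax; jz 1f; ret */
                return live;

        return zeroed;          /* xorl %ebx, %ebx ... xorl %r15d, %r15d */
}

#ifdef STUB_DEMO_MAIN
int main(void)
{
        struct callee_saved old = { 1, 2, 3, 4, 5, 6 };
        struct callee_saved ok = proposed_stub(old, 0);
        struct callee_saved err = proposed_stub(old, -2);

        printf("success: bx=%llu, failure: bx=%llu\n",
               (unsigned long long)ok.bx, (unsigned long long)err.bx);
        printf("old stub: %d bytes, xors: %d bytes\n",
               OLD_STUB_BYTES, NEW_XOR_BYTES);
        return 0;
}
#endif
```

Build with -DSTUB_DEMO_MAIN for the demo main(). The model deliberately
has no binfmt parameter: the zeroing no longer depends on which loader ran.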

Any reason we should not do this change?

-- 
vda