lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVt5AnrdHrj=HcWrLRHT7TN1ZoFb3cqj0VoktCn4c0W6A@mail.gmail.com>
Date:	Mon, 9 Mar 2015 11:04:55 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
	Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Will Drewry <wad@...omium.org>,
	Kees Cook <keescook@...omium.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86: entry_32.S: change ESPFIX test to not touch PT_OLDSS(%esp)

On Mon, Mar 9, 2015 at 10:59 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Mon, Mar 9, 2015 at 10:45 AM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> If sp0 is set to the very top of the stack, then an NMI immediately
>> after sysenter will have OLDSS off the top of the stack, and reading
>> it can crash.  This is why 32-bit kernels have a (buggy!) 8 byte
>> offset in sp0.
>
> So I think that for sysenter, we *should* have that 8-byte buffer.
>
> Not in general for sp0, but for MSR_IA32_SYSENTER_ESP (which is sp1, afaik).

Bah.  I'm slightly wrong here.  It's not NMI immediately after
sysenter (that's handled by the emergency stack gunk, although we
might need to fix that too).  It's this:

ENTRY(ia32_sysenter_target)
    CFI_STARTPROC simple
    CFI_SIGNAL_FRAME
    CFI_DEF_CFA esp, 0
    CFI_REGISTER esp, ebp
    movl TSS_sysenter_sp0(%esp),%esp
sysenter_past_esp:
    /*
     * Interrupts are disabled here, but we can't trace it until
     * enough kernel state to call TRACE_IRQS_OFF can be called - but
     * we immediately enable interrupts at that point anyway.
     */

--> NMI here

    pushl_cfi $__USER_DS

--> or here

    /*CFI_REL_OFFSET ss, 0*/
    pushl_cfi %ebp

We could fix that locally by loading sp0 - 8 atomically (w/ percpu
help) and populating the top two words with mov instead of push.

One option would be to change the NMI entry code to move itself down 8
bytes if this happens (came from kernel mode or sp == sp0 - 12,
perhaps).  Or we could suck it up and keep that 8 byte offset (and fix
the cpu_tss initializer bug that threw me off in the first place).

--Andy

>
> Just make the rule be that you can never ever have a kernel stack
> frame that doesn't contain room for ss/sp at the top.
>
> We have various code that looks at and touches "pt_regs" anyway, and
> accesses things out for debugging/oopsing/tracing etc. Let's not make
> the rule be that you cannot look at regs->ss without checking various
> random other fields first.
>
>                          Linus



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ