linux-kernel - Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

Message-ID: <CALCETrX_RCGH0C4GJ0Zxm1emcasMbZ5s29HCfFy=GrEC9w28+A@mail.gmail.com>
Date:	Thu, 23 Apr 2015 09:27:03 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Ingo Molnar <mingo@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Borislav Petkov <bp@...en8.de>,
	"H. Peter Anvin" <hpa@...or.com>, Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Will Drewry <wad@...omium.org>,
	Kees Cook <keescook@...omium.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

On Thu, Apr 23, 2015 at 9:13 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Thu, Apr 23, 2015 at 9:06 AM, Brian Gerst <brgerst@...il.com> wrote:
>>
>> So you are saying we should save and conditionally restore the
>> kernel's %ss during context switch?  That shouldn't be too bad.  Half
>> of the time you would be loading the null selector which is fast (no
>> GDT access, no validation).
>
> I'd almost prefer something along those lines, yes. Who knows *what*
> leaks? If the present bit state leaks, then likely so does the limit
> value etc etc..
>

I'll go out on a limb and guess the present bit doesn't leak.  If I
were implementing an x86 cpu, I wouldn't have a present bit at all in
the descriptor cache, since you aren't supposed to be able to load a
non-present descriptor in the first place.  I bet it's the limit we're
seeing.

But I think I prefer something closer to Denys' approach with
alternatives instead.  I think the only case that matters (if my
hare-brained explanation of the actual crash is right) is when we
sysret (q or l) while SS is 0.  That only happens if we scheduled
inside a syscall, and I'm guessing that testing if ss is zero and
reloading it on syscall return will be a smaller performance hit than
reloading on all context switches.  The latter could happen more than
once per syscall, and it could also affect tasks that aren't doing
syscalls at all and are therefore unaffected.
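
To make the cost argument above concrete, here is a toy model in C. It is
not kernel code, and every name in it (the event kinds, both counting
functions) is hypothetical. It assumes that SS is valid on syscall entry,
that a context switch may resume the task with SS == 0, and that each
repair costs one segment reload. Under those assumptions it counts how many
reloads each strategy performs over the same event trace.

```c
#include <assert.h>
#include <stddef.h>

/* Toy event kinds; hypothetical names, not kernel identifiers. */
enum event {
    EV_SYSCALL_ENTER,   /* model assumption: SS is valid after entry */
    EV_SWITCH_SS_NULL,  /* context switch that resumes the task with SS == 0 */
    EV_SWITCH_SS_OK,    /* context switch that resumes with a valid SS */
    EV_SYSRET           /* return to user mode via SYSRET */
};

/* Strategy A: repair SS at every context switch that leaves it null. */
size_t reloads_fix_at_switch(const enum event *trace, size_t n)
{
    size_t reloads = 0;

    assert(trace != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (trace[i] == EV_SWITCH_SS_NULL)
            reloads++;
    return reloads;
}

/* Strategy B: test SS just before SYSRET and reload only if it is null. */
size_t reloads_fix_at_sysret(const enum event *trace, size_t n)
{
    size_t reloads = 0;
    int ss_null = 0;    /* tracks whether SS is currently the null selector */

    assert(trace != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (trace[i]) {
        case EV_SYSCALL_ENTER:
        case EV_SWITCH_SS_OK:
            ss_null = 0;
            break;
        case EV_SWITCH_SS_NULL:
            ss_null = 1;
            break;
        case EV_SYSRET:
            if (ss_null) {
                reloads++;
                ss_null = 0;
            }
            break;
        }
    }
    return reloads;
}
```

For a trace in which one syscall is scheduled away three times and the task
is then preempted twice more without making any syscall, the first function
counts five reloads and the second counts one. The model says nothing about
the real cost of the extra test on the syscall return path; that would have
to be measured.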

I'll try to send out a patch and a test case later today, but no
promises -- the test case will be a bit tedious, and I'm already
overcommitted for today :(

A sketch of a reproducer:

Two threads.  Thread 1 sets ss to some very-low-limit value, and it
loops doing mov $-1, %eax; int $0x80.  Thread 2 is ordinary 32-bit code
doing while(true) usleep(1);

--Andy