linux-kernel - Re: FSGSBASE ABI considerations


Message-ID: <CALCETrUnMqtBNrUm8DX3V_sD81BYEOVjxG7Pr2yw6PEJmM5iZg@mail.gmail.com>
Date:   Mon, 7 Aug 2017 12:07:45 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Andy Lutomirski <luto@...nel.org>, Stas Sergeev <stsp@...t.ru>,
        "Bae, Chang Seok" <chang.seok.bae@...el.com>,
        X86 ML <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Borislav Petkov <bpetkov@...e.de>,
        Brian Gerst <brgerst@...il.com>,
        Bart Oldeman <bartoldeman@...rs.sourceforge.net>
Subject: Re: FSGSBASE ABI considerations

On Mon, Aug 7, 2017 at 10:35 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Mon, Aug 7, 2017 at 9:20 AM, Andy Lutomirski <luto@...nel.org> wrote:
>>
>> Windows does something sort of like this (I think), but I don't like
>> this solution.  I fully expect that someone will write a program that
>> does:
>>
>> old = rdgsbase();
>> wrgsbase(new);
>> call_very_fast_function();
>> wrgsbase(old);
>>
>> This will work if GS == 0, which is fine.  The problem is that it will
>> *also* work if GS != 0 with very high probability, especially if this
>> code sequence is right after some operation that sleeps.  And then
>> we'll get random crashes with very low probability, depending on where
>> the scheduler hits.
>
> It will work reliably if you just make the scheduler save/restore the
> base rather than the selector.
>
> I really think you need to walk away from the "selector is meaningful"
> model. Yes, yes, it's the legacy model, but it's the *insane* model.
>
> So screw the selector. It doesn't matter. We'll need to save/restore
> the value, but that's it. What we *really* save and restore is just
> the base pointer.
>
> Why do you care so much about the selector? If people *don't* use the
> fsgsbase, then the selector and the base of the segment will always
> match anyway (modulo the system calls that actually change the
> gdt/ldt, and we can just say that *then* selectors matter).
>
> And if people *do* use fsgsbase, then the selector is by definition
> not important.
>
> So just make the scheduler save the base first, and restore it last.
> End of problem. Your user-space code above just works. There is no
> race, it doesn't matter one whit whether GS is 0 or not, there simply
> is no problem.

I agree completely.  The scheduler should do exactly this and, with my
patches applied, it does.
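
To make the ordering concrete, here is a toy user-space model of the
policy discussed above.  It is only a sketch and not the code from the
patches: the struct, the function names, and the descriptor_base()
lookup are all hypothetical, and the "selector-driven" variant is a
simplified stand-in for the behavior described in the quoted text (fine
when GS == 0, wrong when GS != 0).

```c
#include <stdint.h>

/* Toy user-space model of GS state; not kernel code. All names are hypothetical. */
struct gs_state {
    uint16_t sel;   /* GS selector */
    uint64_t base;  /* GS base, as wrgsbase() would set it */
};

/* Assumed stand-in for a GDT/LDT lookup: selector 0 maps to base 0. */
static uint64_t descriptor_base(uint16_t sel)
{
    return (uint64_t)sel * 0x1000u;
}

/* Loading a selector also reloads the base from the descriptor. */
static void load_gs(struct gs_state *cpu, uint16_t sel)
{
    cpu->sel = sel;
    cpu->base = descriptor_base(sel);
}

/* Switch out: save the base first, then the selector. */
static void switch_out(const struct gs_state *cpu, struct gs_state *saved)
{
    saved->base = cpu->base;
    saved->sel = cpu->sel;
}

/* Switch in: reload the selector, then restore the base last so it wins. */
static void switch_in_base_last(struct gs_state *cpu, const struct gs_state *saved)
{
    load_gs(cpu, saved->sel);
    cpu->base = saved->base;
}

/* Contrast case: a nonzero selector dictates the base; only GS == 0 keeps it. */
static void switch_in_selector_driven(struct gs_state *cpu, const struct gs_state *saved)
{
    load_gs(cpu, saved->sel);
    if (saved->sel == 0)
        cpu->base = saved->base;
}

/* Do wrgsbase(new_base), get preempted, get rescheduled; return the base seen. */
static uint64_t base_after_preemption(uint16_t sel, uint64_t new_base, int base_last)
{
    struct gs_state cpu, saved;

    load_gs(&cpu, sel);
    cpu.base = new_base;          /* models wrgsbase(new) */
    switch_out(&cpu, &saved);
    load_gs(&cpu, 0);             /* some other task runs with GS == 0 */
    if (base_last)
        switch_in_base_last(&cpu, &saved);
    else
        switch_in_selector_driven(&cpu, &saved);
    return cpu.base;
}
```

In this model, base_after_preemption(3, new, 1) returns new, while
base_after_preemption(3, new, 0) returns the descriptor's base instead,
which is the low-probability crash scenario from the quoted example.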

>
> So just what is the problem you're trying to solve?
>

I'm trying to avoid a situation where we implement that policy and the
interaction with modify_ldt() becomes very strange.  Linux has a long
history of having ill-defined semantics on x86_64, and I don't want to
make it worse.

If we *just* change the way the scheduler works, then we end up with
modify_ldt() behaving deterministically on IVB+ and behaving
deterministically on 32-bit kernels, but having that deterministic
behavior be *different*.  This makes me rather unhappy about the whole
situation.

Also, I don't want to break gdb, and even telling whether a change
breaks gdb is an incredible PITA.  When gdb saves and restores a
context, it currently restores the base first and the selector second,
and I have no idea whether gdb expects restoring the selector to
update the base.
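
The ordering concern can be sketched with another toy model.  This is
not gdb's code or the kernel's ptrace path; the struct, lookup_base(),
and the selector_write_reloads_base flag are hypothetical and exist
only to show that "base first, selector second" gives different results
depending on which semantics the kernel picks for a selector write.

```c
#include <stdint.h>

/* Hypothetical register context as a debugger might save it. */
struct gs_ctx {
    uint16_t sel;
    uint64_t base;
};

/* Assumed stand-in for a descriptor-table lookup. */
static uint64_t lookup_base(uint16_t sel)
{
    return (uint64_t)sel * 0x1000u;
}

/*
 * Restore in the order described: base first, selector second.
 * selector_write_reloads_base selects which of the two possible
 * semantics for a selector write is being modelled.
 */
static uint64_t restore_base_then_selector(const struct gs_ctx *saved,
                                           int selector_write_reloads_base)
{
    struct gs_ctx live = { 0, 0 };

    live.base = saved->base;            /* step 1: write the base */
    live.sel = saved->sel;              /* step 2: write the selector */
    if (selector_write_reloads_base)
        live.base = lookup_base(live.sel);  /* step 2 clobbers step 1 */
    return live.base;
}
```

With a saved context whose base was set independently of its selector,
the two settings of the flag return different bases, so a debugger
relying on one behavior would see the other as a regression.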