linux-kernel - Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches

Message-ID: <341a47de-2c12-43a3-2d5d-d9727a1e7420@citrix.com>
Date:   Wed, 9 Aug 2023 11:04:15 +0100
From:   Andrew.Cooper3@...rix.com
To:     Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Cc:     linux-kernel@...r.kernel.org, David.Kaplan@....com,
        jpoimboe@...nel.org, gregkh@...uxfoundation.org
Subject: Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches

On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
>
> Anyway, this here overhauls the SRSO patches in a big way.
>
> I claim that AMD retbleed (also called Speculative-Type-Confusion

Branch Type Confusion.

Speculative Type Confusion is something else; generally Spectre v1 or v2
around a logical type check, usually ending up confusing pointers and
integers.

It appears that you might be suffering from Type-of-Speculative-Bug
Confusion, an affliction brought on by the chronic lack of documentation
and consistency, the fact that almost everything has at least 2 names,
and that, 6 years in, this horror show is not showing any sign of
slowing down.

>  -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.

BTC and SRSO are certainly related, but they're not the same.

With BTC, an attacker poisons a branch type prediction to say "that
thing (which isn't actually a ret) is a ret".

With SRSO, an attacker leaves a poisoned infinite-call-loop prediction.
Later, a real function (that is architecturally correct execution and
will retire) trips over the predicted infinite loop, which overflows the
RSB/RAS/RAP, replacing the correct prediction on the top with the
attacker's choice of value.

So while branch type confusion is used to poison the top-of-RSB value,
the ret that actually goes wrong needs a correct type=ret prediction for
the SRSO attack to succeed.
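
As a rough illustration of the overflow step only, here is a toy model of
a fixed-depth return predictor.  Everything in it is hypothetical: the
class and function names, the depth of 4, and in particular the
assumption that squashing the speculative calls restores the stack
pointer but not the overwritten slots.  It is a sketch of the idea, not a
description of any real microarchitecture.

```python
class ToyReturnStack:
    """Fixed-depth circular return predictor (toy model, not real hardware)."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = [None] * depth
        self.top = 0  # pushes minus pops; the slot index is top % depth

    def push(self, return_addr):
        # A call records its return address; old entries are overwritten
        # once more than `depth` calls are outstanding.
        self.slots[self.top % self.depth] = return_addr
        self.top += 1

    def pop(self):
        # A ret consumes the most recent entry as its predicted target.
        self.top -= 1
        return self.slots[self.top % self.depth]


def simulate(depth, speculative_calls):
    rsb = ToyReturnStack(depth)
    rsb.push("victim_return")           # real, architecturally correct call
    checkpoint = rsb.top
    for _ in range(speculative_calls):  # mispredicted call loop
        rsb.push("attacker_target")
    rsb.top = checkpoint                # ASSUMPTION: squash restores the pointer only
    return rsb.pop()                    # prediction used by the victim's ret


print(simulate(depth=4, speculative_calls=4))  # attacker_target
print(simulate(depth=4, speculative_calls=2))  # victim_return
```

With fewer speculative calls than the depth, the victim's own entry
survives; once the loop wraps the buffer, the entry the real ret consumes
holds the attacker's value.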

Both issues can be mitigated with IBPB-on-entry (given up-to-date
microcode in some cases).

Both issues have a software sequence that tries to make the contents of
a __x86_return_thunk sequence safe to use.  For BTC, it's simply a case
of ensuring the type prediction of the one ret is good.  For SRSO, it's
something more complicated and I don't know the uarch details fully.

~Andrew