lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 22 Aug 2023 00:01:29 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Josh Poimboeuf <jpoimboe@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Borislav Petkov <bp@...en8.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Babu Moger <babu.moger@....com>, David.Kaplan@....com,
        Nikolay Borisov <nik.borisov@...e.com>,
        gregkh@...uxfoundation.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH RFC 4/4] x86/srso: Use CALL-based return thunks to reduce
 overhead

On 21/08/2023 4:16 pm, Josh Poimboeuf wrote:
> On Mon, Aug 21, 2023 at 12:27:23PM +0100, Andrew Cooper wrote:
>> The SRSO safety depends on having a CALL to an {ADD,LEA}/RET sequence which
>> has been made safe in the BTB.  Specifically, there needs to be no pertubance
>> to the RAS between a correctly predicted CALL and the subsequent RET.
>>
>> Use the new infrastructure to CALL to a return thunk.  Remove
>> srso_fam1?_safe_ret() symbols and point srso_fam1?_return_thunk().
>>
>> This removes one taken branch from every function return, which will reduce
>> the overhead of the mitigation.  It also removes one of three moving pieces
>> from the SRSO mess.
> So, the address of whatever instruction comes after the 'CALL
> srso_*_return_thunk' is added to the RSB/RAS, and that might be
> speculated to when the thunk returns.  Is that a concern?

That is very intentional, and key to the safety.

Replacing a RET with a CALL/{ADD,LEA}/RET sequence is a form of
retpoline thunk.  The only difference with regular retpolines is that
the intended target is already on the stack, and not in a GPR.


If the CALL mispredicts, it doesn't matter.  When decode catches up
(allegedly either instantaneously on Fam19h, or a few cycles late on
Fam17h), the top of the RAS is corrected will point at the INT3
following the CALL instruction.

When the CALL is corrected, speculation continues at the real
destination (the {ADD,LEA}/RET sequence) where the {ADD,LEA} pops the
"wrong" return address off the stack and lets the RET take the next
address up the stack.

The RET predicts to INT3 following the call (which is safe), and
eventually gets corrected to the parent return address on the stack
which is the real intended destination.

Therefore, rogue RET speculation is always safely contained at the INT3
until the RET uop can execute, notice the mispredict, and correct to
what the stack says is correct.

Of course, relying on the fact that the {ADD,LEA}+RET sequence doesn't
have poison in the BTB, which is what the UNTRAIN_RET sequence is trying
to achieve with microarchitectural means.

>> Signed-off-by: Andrew Cooper <andrew.cooper3@...rix.com>
>> ---
>> CC: x86@...nel.org
>> CC: linux-kernel@...r.kernel.org
>> CC: Borislav Petkov <bp@...en8.de>
>> CC: Peter Zijlstra <peterz@...radead.org>
>> CC: Josh Poimboeuf <jpoimboe@...nel.org>
>> CC: Babu Moger <babu.moger@....com>
>> CC: David.Kaplan@....com
>> CC: Nikolay Borisov <nik.borisov@...e.com>
>> CC: gregkh@...uxfoundation.org
>> CC: Thomas Gleixner <tglx@...utronix.de>
>>
>> RFC:
>>
>>   vmlinux.o: warning: objtool: srso_fam17_return_thunk(): can't find starting instruction
>>
>> Any objtool whisperers know what's going on, and particularly why
>> srso_fam19_return_thunk() appears to be happy?
>>
>> Also, depends on the resolution of the RFC in the previous patch.
> I can take a look.

Thanks.

~Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ