linux-kernel - Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87y1wotayr.ffs@tglx>
Date:   Wed, 20 Jul 2022 11:00:44 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Peter Zijlstra <peterz@...radead.org>,
        Sami Tolvanen <samitolvanen@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Andrew Cooper <Andrew.Cooper3@...rix.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Johannes Wikner <kwikner@...z.ch>,
        Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
        Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
        Joao Moreira <joao.moreira@...el.com>,
        Joseph Nuzman <joseph.nuzman@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juergen Gross <jgross@...e.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

On Tue, Jul 19 2022 at 01:51, Peter Zijlstra wrote:
> On Mon, Jul 18, 2022 at 03:48:04PM -0700, Sami Tolvanen wrote:
>> On Mon, Jul 18, 2022 at 2:18 PM Peter Zijlstra <peterz@...radead.org> wrote:
>> > Ofc, we can still put the whole:
>> >
>> >         sarq    $5, PER_CPU_VAR(__x86_call_depth);
>> >         jmp     \func_direct
>> >
>> > thing in front of that.
>> 
>> Sure, that would work.
>
> So if we assume \func starts with ENDBR, and further assume we've fixed
> up every direct jmp/call to land at +4, we can overwrite the ENDBR with
> part of the SARQ, that leaves us 6 more byte, placing the immediate at
> -10 if I'm not mis-counting.
>
> Now, the call sites are:
>
> 41 81 7b fa 78 56 34 12		cmpl	$0x12345678, -6(%r11)
> 74 02				je	1f
> 0f 0b				ud2
> e8 00 00 00 00		1:	call	__x86_indirect_thunk_r11
>
> That means the offset of +10 lands in the middle of the CALL
> instruction, and since we only have 16 thunks there is a limited number
> of byte patterns available there.
>
> This really isn't as nice as the -6 but might just work well enough,
> hmm?

So I added a 32byte padding and put the thunk at the start:

        sarq    $5, PER_CPU_VAR(__x86_call_depth);
        jmp     \func_direct

For sockperf that costs about 1% performance vs. the 16 byte
variant. For mitigations=off it's a ~0.5% drop.

That's on a SKL. Did not check on other systems yet.

Thanks,

        tglx