linux-kernel - Re: Virt Call depth tracking mitigation


Message-ID: <261e141a-7e7f-ce26-60fe-df1957e393df@citrix.com>
Date:   Tue, 19 Jul 2022 16:23:30 +0000
From:   Andrew Cooper <Andrew.Cooper3@...rix.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
CC:     "x86@...nel.org" <x86@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Johannes Wikner <kwikner@...z.ch>,
        Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
        Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
        Joao Moreira <joao.moreira@...el.com>,
        Joseph Nuzman <joseph.nuzman@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juergen Gross <jgross@...e.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        "kys@...rosoft.com" <kys@...rosoft.com>,
        "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>,
        "decui@...rosoft.com" <decui@...rosoft.com>,
        Michael Kelley <mikelley@...rosoft.com>
Subject: Re: Virt Call depth tracking mitigation

On 19/07/2022 15:13, Thomas Gleixner wrote:
> On Tue, Jul 19 2022 at 10:24, Andrew Cooper wrote:
>> On 17/07/2022 00:17, Thomas Gleixner wrote:
>>> As IBRS is a performance horror show, Peter Zijlstra and I revisited the
>>> call depth tracking approach and implemented it in a way which is hopefully
>>> more palatable and avoids the downsides of the original attempt.
>>>
>>> We both unsurprisingly hate the result with a passion...
>> And I hate to add more problems, but here we go.
>>
>> Under virt, it's not just SMIs which might run behind your back.
>> Regular interrupts/etc can probably be hand-waved away in the same way
>> that SMIs are.
> You mean host side interrupts, right?

Yes.

>
>> Hypercalls however are a different matter.
>>
>> Xen and HyperV both have hypercall pages, where the hypervisor provides
>> some executable code for the guest kernel to use.
>>
>> Under the current scheme, the calls into the hypercall pages get
>> accounted, as objtool can see them, but the rets don't.  This imbalance
>> is exacerbated because some hypercalls are called in loops.
> Bah.
>
>> Worse however, it opens a hole where branch history is calculable and
>> the ret can reliably underflow.  This occurs when there's a minimal call
>> depth in Linux to get to the hypercall, and then a call depth of >16 in
>> the hypervisor.
>>
>> The only variable in these cases is how much user control there is of
>> the registers, and I for one am not feeling lucky in the face of the current
>> research.
>>
>> The only solution I see here is for Linux to ret-thunk the hypercall
>> page too.  Under Xen, the hypercall page is mutable by the guest and
>> there is room to turn every ret into a jmp, but obviously none of this
>> is covered by any formal ABI, and this probably needs more careful
>> consideration than the short time I've put towards it.
> Well, that makes the guest side "safe", but isn't a deep hypercall > 16
> already underflowing in the hypervisor code before it returns to the
> guest?

Yeah, but that's the hypervisor's problem to deal with, in whatever
manner it sees fit.

And if the hypervisor is using IBeeRS then the first ret in guest
context will underflow.
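
A toy model may make the imbalance and the underflow concrete. This is an
illustrative sketch only, not code from the patch series: the 16-entry RSB
size is assumed from the ">16" figure above, and ToyRsb, hypercall(),
tracked_depth and guest_underflows are hypothetical names. "Tracked" stands
for a call or ret that the guest kernel's accounting can see.

```python
RSB_ENTRIES = 16  # assumption: RSB size implied by the ">16" figure in the thread


class ToyRsb:
    """Toy return stack buffer plus a software call-depth counter."""

    def __init__(self):
        self.entries = []          # modelled RSB contents, newest last
        self.tracked_depth = 0     # what the guest's accounting believes
        self.guest_underflows = 0  # rets in guest context that found the RSB empty

    def call(self, tracked):
        if len(self.entries) == RSB_ENTRIES:
            self.entries.pop(0)    # the oldest entry is overwritten
        self.entries.append(None)
        if tracked:
            self.tracked_depth += 1

    def ret(self, tracked, in_guest):
        if self.entries:
            self.entries.pop()
        elif in_guest:
            self.guest_underflows += 1
        if tracked:
            self.tracked_depth -= 1


def hypercall(linux_depth, hv_depth):
    """Model one hypercall: linux_depth tracked calls (the last one enters the
    hypercall page), then hv_depth untracked calls and rets in the hypervisor,
    then the untracked ret in the hypercall page and the remaining guest rets."""
    cpu = ToyRsb()
    for _ in range(linux_depth):
        cpu.call(tracked=True)
    for _ in range(hv_depth):
        cpu.call(tracked=False)
    for _ in range(hv_depth):
        cpu.ret(tracked=False, in_guest=False)
    cpu.ret(tracked=False, in_guest=True)   # the ret inside the hypercall page
    for _ in range(linux_depth - 1):
        cpu.ret(tracked=True, in_guest=True)
    return cpu.tracked_depth, cpu.guest_underflows


print(hypercall(linux_depth=2, hv_depth=4))   # shallow hypervisor path
print(hypercall(linux_depth=2, hv_depth=17))  # deep hypervisor path
```

In this model the counter is left one too high after every hypercall, because
the ret in the hypercall page is never accounted, and a deep hypervisor path
leaves the RSB empty so the guest's own rets underflow while the counter still
reports a positive depth.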

>> That said, after a return from the hypervisor, Linux has no idea what
>> state the RSB is in, so the only safe course of action is to re-stuff.
> Indeed.
>
> Another proof for my claim that virt creates more problems than it
> solves.

So how did you like debugging the gsbase crash on native hardware? :)

~Andrew