linux-kernel - RE: Virt Call depth tracking mitigation

Message-ID: <PH0PR21MB30250D66D36B5D94C809BE31D78F9@PH0PR21MB3025.namprd21.prod.outlook.com>
Date:   Tue, 19 Jul 2022 14:45:40 +0000
From:   "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To:     Andrew Cooper <Andrew.Cooper3@...rix.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
CC:     "x86@...nel.org" <x86@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Johannes Wikner <kwikner@...z.ch>,
        Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
        Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
        Joao Moreira <joao.moreira@...el.com>,
        Joseph Nuzman <joseph.nuzman@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juergen Gross <jgross@...e.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>
Subject: RE: Virt Call depth tracking mitigation

From: Andrew Cooper <Andrew.Cooper3@...rix.com> Sent: Tuesday, July 19, 2022 3:25 AM
> 
> On 17/07/2022 00:17, Thomas Gleixner wrote:
> > As IBRS is a performance horror show, Peter Zijlstra and me revisited the
> > call depth tracking approach and implemented it in a way which is hopefully
> > more palatable and avoids the downsides of the original attempt.
> >
> > We both unsurprisingly hate the result with a passion...
> 
> And I hate to add more problems, but here we go.
> 
> Under virt, it's not just SMIs which might run behind your back.
> Regular interrupts/etc can probably be hand-waved away in the same way
> that SMIs are.
> 
> Hypercalls however are a different matter.
> 
> Xen and HyperV both have hypercall pages, where the hypervisor provides
> some executable code for the guest kernel to use.
> 
> Under the current scheme, the calls into the hypercall pages get
> accounted, as objtool can see them, but the ret's don't.  This imbalance
> is exacerbated because some hypercalls are called in loops.
> 
> Worse however, it opens a hole where branch history is calculable and
> the ret can reliably underflow.  This occurs when there's a minimal call
> depth in Linux to get to the hypercall, and then a call depth of >16 in
> the hypervisor.
> 
> The only variable in these cases is how much user control there is of
> the registers, and I for one am not feeling lucky in face of the current
> research.
> 
> The only solution I see here is for Linux to ret-thunk the hypercall
> page too.  Under Xen, the hypercall page is mutable by the guest and
> there is room to turn every ret into a jmp, but obviously none of this
> is covered by any formal ABI, and this probably needs more careful
> consideration than the short time I've put towards it.
> 
> That said, after a return from the hypervisor, Linux has no idea what
> state the RSB is in, so the only safe course of action is to re-stuff.
> 
> CC'ing the HyperV folk for input on their side.

In Hyper-V, the hypercall page is *not* writable by the guest.  Quoting
from Section 3.13 in the Hyper-V TLFS:

    The hypercall page appears as an "overlay" to the GPA space; that is,
    it covers whatever else is mapped to the GPA range. Its contents are
    readable and executable by the guest. Attempts to write to the
    hypercall page will result in a protection (#GP) exception.

And:

    After the interface has been established, the guest can initiate a
    hypercall. To do so, it populates the registers per the hypercall protocol
    and issues a CALL to the beginning of the hypercall page. The guest
    should assume the hypercall page performs the equivalent of a near
    return (0xC3) to return to the caller.  As such, the hypercall must be
    invoked with a valid stack.
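
To make the accounting concern concrete, here is a toy model of the
imbalance described in the quoted mail, combined with the TLFS statement
that the page returns with a plain near return. It is only a sketch: a
simple signed counter stands in for the kernel's call depth accounting,
which is not how the actual patches track depth, and every name in it
(depth_model, tracked_call, thunked_ret, hypercall_page_ret,
hypercall_loop_drift) is hypothetical.

```c
#include <assert.h>

/*
 * Toy model only. A plain counter stands in for call depth accounting;
 * the real implementation differs. All names are hypothetical.
 */
struct depth_model {
    long tracked_depth; /* what the accounting believes is outstanding */
    long real_depth;    /* CALLs that have not yet been returned from */
};

/* A CALL site that objtool can see: it is accounted. */
static void tracked_call(struct depth_model *m)
{
    m->tracked_depth++;
    m->real_depth++;
}

/* A RET routed through a return thunk: it is accounted. */
static void thunked_ret(struct depth_model *m)
{
    m->tracked_depth--;
    m->real_depth--;
}

/* The plain near return (0xC3) in the hypercall page: not accounted. */
static void hypercall_page_ret(struct depth_model *m)
{
    m->real_depth--;
}

/* Issue `count` hypercalls in a loop and return the accounting drift. */
static long hypercall_loop_drift(struct depth_model *m, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        tracked_call(m);       /* CALL to the start of the hypercall page */
        hypercall_page_ret(m); /* the page returns with a bare RET */
    }
    return m->tracked_depth - m->real_depth;
}
```

In this model each hypercall leaves the tracked depth one higher than
the real depth, so a loop of N hypercalls drifts by N, whereas a
tracked_call() paired with thunked_ret() leaves the drift unchanged.
Since the guest cannot rewrite the Hyper-V page, the ret inside it
cannot be converted to a jmp to a thunk the way the quoted mail
suggests for Xen.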

Michael