lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZaSQn7RCRTaBK1bc@chao-email>
Date: Mon, 15 Jan 2024 09:55:43 +0800
From: Chao Gao <chao.gao@...el.com>
To: "Yang, Weijiang" <weijiang.yang@...el.com>
CC: Sean Christopherson <seanjc@...gle.com>, Rick P Edgecombe
	<rick.p.edgecombe@...el.com>, Dave Hansen <dave.hansen@...el.com>,
	"peterz@...radead.org" <peterz@...radead.org>, "john.allen@....com"
	<john.allen@....com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "mlevitsk@...hat.com"
	<mlevitsk@...hat.com>
Subject: Re: [PATCH v8 00/26] Enable CET Virtualization

On Thu, Jan 11, 2024 at 10:56:55PM +0800, Yang, Weijiang wrote:
>On 1/9/2024 11:10 PM, Sean Christopherson wrote:
>> On Mon, Jan 08, 2024, Weijiang Yang wrote:
>> > On 1/6/2024 12:21 AM, Sean Christopherson wrote:
>> > > On Fri, Jan 05, 2024, Weijiang Yang wrote:
>> > > > On 1/5/2024 8:54 AM, Sean Christopherson wrote:
>> > > > > On Fri, Jan 05, 2024, Rick P Edgecombe wrote:
>> > > > > > > For CALL/RET (and presumably any branch instructions with IBT?) other
>> > > > > > > instructions that are directly affected by CET, the simplest thing would
>> > > > > > > probably be to disable those in KVM's emulator if shadow stacks and/or IBT
>> > > > > > > are enabled, and let KVM's failure paths take it from there.
>> > > > > > Right, that is what I was wondering might be the normal solution for
>> > > > > > situations like this.
>> > > > > If KVM can't emulate something, it either retries the instruction (with some
>> > > > > decent logic to guard against infinite retries) or punts to userspace.
>> > > > What kind of error is proper if KVM has to punt to userspace?
>> > > KVM_INTERNAL_ERROR_EMULATION.  See prepare_emulation_failure_exit().
>> > > 
>> > > > Or just inject #UD into guest on detecting this case?
>> > > No, do not inject #UD or do anything else that deviates from architecturally
>> > > defined behavior.
>> > Thanks!
>> > But based on current KVM implementation and patch 24, seems that if CET is exposed
>> > to guest, the emulation code or shadow paging mode couldn't be activated at the same time:
>> No, requiring unrestricted guest only disables the paths where KVM *delibeately*
>> emulates the entire guest code stream.  In no way, shape, or form does it prevent
>> KVM from attempting to emulate arbitrary instructions.
>
>Yes, also need to prevent sporadic emulation, how about adding below patch in emulator?
>
>
>diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>index e223043ef5b2..e817d8560ceb 100644
>--- a/arch/x86/kvm/emulate.c
>+++ b/arch/x86/kvm/emulate.c
>@@ -178,6 +178,7 @@
> #define IncSP       ((u64)1 << 54)  /* SP is incremented before ModRM calc */
> #define TwoMemOp    ((u64)1 << 55)  /* Instruction has two memory operand */
> #define IsBranch    ((u64)1 << 56)  /* Instruction is considered a branch. */
>+#define IsProtected ((u64)1 << 57)  /* Instruction is protected by CET. */
>
> #define DstXacc     (DstAccLo | SrcAccHi | SrcWrite)
>
>@@ -4098,9 +4099,9 @@ static const struct opcode group4[] = {
> static const struct opcode group5[] = {
>        F(DstMem | SrcNone | Lock,              em_inc),
>        F(DstMem | SrcNone | Lock,              em_dec),
>-       I(SrcMem | NearBranch | IsBranch,       em_call_near_abs),
>-       I(SrcMemFAddr | ImplicitOps | IsBranch, em_call_far),
>-       I(SrcMem | NearBranch | IsBranch,       em_jmp_abs),
>+       I(SrcMem | NearBranch | IsBranch | IsProtected, em_call_near_abs),
>+       I(SrcMemFAddr | ImplicitOps | IsBranch | IsProtected, em_call_far),
>+       I(SrcMem | NearBranch | IsBranch | IsProtected, em_jmp_abs),
>        I(SrcMemFAddr | ImplicitOps | IsBranch, em_jmp_far),
>        I(SrcMem | Stack | TwoMemOp,            em_push), D(Undefined),
> };
>@@ -4362,11 +4363,11 @@ static const struct opcode opcode_table[256] = {
>        /* 0xC8 - 0xCF */
>        I(Stack | SrcImmU16 | Src2ImmByte | IsBranch, em_enter),
>        I(Stack | IsBranch, em_leave),
>-       I(ImplicitOps | SrcImmU16 | IsBranch, em_ret_far_imm),
>-       I(ImplicitOps | IsBranch, em_ret_far),
>-       D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch, intn),
>+       I(ImplicitOps | SrcImmU16 | IsBranch | IsProtected, em_ret_far_imm),
>+       I(ImplicitOps | IsBranch | IsProtected, em_ret_far),
>+       D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch | IsProtected, intn),
>        D(ImplicitOps | No64 | IsBranch),
>-       II(ImplicitOps | IsBranch, em_iret, iret),
>+       II(ImplicitOps | IsBranch | IsProtected, em_iret, iret),
>        /* 0xD0 - 0xD7 */
>        G(Src2One | ByteOp, group2), G(Src2One, group2),
>        G(Src2CL | ByteOp, group2), G(Src2CL, group2),
>@@ -4382,7 +4383,7 @@ static const struct opcode opcode_table[256] = {
>        I2bvIP(SrcImmUByte | DstAcc, em_in,  in,  check_perm_in),
>        I2bvIP(SrcAcc | DstImmUByte, em_out, out, check_perm_out),
>        /* 0xE8 - 0xEF */
>-       I(SrcImm | NearBranch | IsBranch, em_call),
>+       I(SrcImm | NearBranch | IsBranch | IsProtected, em_call),
>        D(SrcImm | ImplicitOps | NearBranch | IsBranch),
>        I(SrcImmFAddr | No64 | IsBranch, em_jmp_far),
>        D(SrcImmByte | ImplicitOps | NearBranch | IsBranch),
>@@ -4401,7 +4402,7 @@ static const struct opcode opcode_table[256] = {
> static const struct opcode twobyte_table[256] = {
>        /* 0x00 - 0x0F */
>        G(0, group6), GD(0, &group7), N, N,
>-       N, I(ImplicitOps | EmulateOnUD | IsBranch, em_syscall),
>+       N, I(ImplicitOps | EmulateOnUD | IsBranch | IsProtected, em_syscall),
>        II(ImplicitOps | Priv, em_clts, clts), N,
>        DI(ImplicitOps | Priv, invd), DI(ImplicitOps | Priv, wbinvd), N, N,
>        N, D(ImplicitOps | ModRM | SrcMem | NoAccess), N, N,
>@@ -4432,8 +4433,8 @@ static const struct opcode twobyte_table[256] = {
>        IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc),
>        II(ImplicitOps | Priv, em_rdmsr, rdmsr),
>        IIP(ImplicitOps, em_rdpmc, rdpmc, check_rdpmc),
>-       I(ImplicitOps | EmulateOnUD | IsBranch, em_sysenter),
>-       I(ImplicitOps | Priv | EmulateOnUD | IsBranch, em_sysexit),
>+       I(ImplicitOps | EmulateOnUD | IsBranch | IsProtected, em_sysenter),
>+       I(ImplicitOps | Priv | EmulateOnUD | IsBranch | IsProtected, em_sysexit),
>        N, N,
>        N, N, N, N, N, N, N, N,
>        /* 0x40 - 0x4F */
>@@ -4971,6 +4972,12 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
>        if (ctxt->d == 0)
>                return EMULATION_FAILED;
>+       if ((opcode.flags & IsProtected) &&
>+           (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_CET)) {

CR4.CET doesn't necessarily mean IBT or shadow stack is enabled. why not check
CPL and IA32_S/U_CET?

>+               WARN_ONCE(1, "CET is active, emulation aborted.\n");

remove this WARN_ONCE(). Guest can trigger this at will and overflow host dmesg.

if you really want to tell usespace the emulation_failure is due to CET, maybe
you can add a new flag like KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES.
for now, I won't bother to add this because probably userspace just terminates
the VM on any instruction failure (i.e., won't try to figure out the reason of
the instruction failure and fix it).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ