linux-kernel - Re: [Question PATCH kernel] x86/amd/sev/nmi+vc: Fix stack handling (why is this happening?)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y9UolYXFzvocxIcn@8bytes.org>
Date:   Sat, 28 Jan 2023 14:52:21 +0100
From:   Joerg Roedel <joro@...tes.org>
To:     Alexey Kardashevskiy <aik@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>, kvm@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Sean Christopherson <seanjc@...gle.com>,
        Jiri Kosina <jkosina@...e.cz>, Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Tom Lendacky <thomas.lendacky@....com>
Subject: Re: [Question PATCH kernel] x86/amd/sev/nmi+vc: Fix stack handling
 (why is this happening?)

On Sat, Jan 28, 2023 at 10:24:56PM +1100, Alexey Kardashevskiy wrote:
> (out of curiosity) where do you see these NOPs? "objdump -D vmlinux" does
> not show any, is this after lifepatching?

Here is the disassembly of exc_nmi of a kernel built from tip/master
with CONFIG_PARAVIRT=n:

<exc_nmi>:
       41 54                   push   %r12
       55                      push   %rbp
       48 89 fd                mov    %rdi,%rbp
       53                      push   %rbx
       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
       65 8b 05 69 66 41 7e    mov    %gs:0x7e416669(%rip),%eax        # 3254c <pcpu_hot+0xc>
       48 98                   cltq
       48 0f a3 05 33 00 2b    bt     %rax,0x12b0033(%rip)        # ffffffff82ecbf20 <__cpu_online_mask>
       01 
       0f 83 c9 00 00 00       jae    ffffffff81c1bfbc <exc_nmi+0xec>
       65 8b 05 f6 41 40 7e    mov    %gs:0x7e4041f6(%rip),%eax        # 200f0 <nmi_state>
       85 c0                   test   %eax,%eax
       0f 85 f8 00 00 00       jne    ffffffff81c1bffa <exc_nmi+0x12a>
       65 c7 05 e3 41 40 7e    movl   $0x1,%gs:0x7e4041e3(%rip)        # 200f0 <nmi_state>
       01 00 00 00 
       0f 20 d0                mov    %cr2,%rax
       65 48 89 05 d0 41 40    mov    %rax,%gs:0x7e4041d0(%rip)        # 200e8 <nmi_cr2>
       7e 
       41 0f 21 fc             mov    %db7,%r12			<-- here is the DR7 read
       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)		<-- here are the NOPS that become a
       								    call to sev_es_ist_enter() in
								    SEV-ES guests

The DR7 read will cause a #VC exception, switching to the #VC IST stack.
If the NMI was raised while already on the #VC IST stack, this DR7 read
will overwrite the previous stack frame and cause stack recursion, with
all funny side effects.


> diff --git a/arch/x86/include/asm/debugreg.h
> b/arch/x86/include/asm/debugreg.h
> index b049d950612f..687b15297057 100644
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -39,7 +39,7 @@ static __always_inline unsigned long
> native_get_debugreg(int regno)
>                 asm("mov %%db6, %0" :"=r" (val));
>                 break;
>         case 7:
> -               asm("mov %%db7, %0" :"=r" (val));
> +               asm volatile ("mov %%db7, %0" :"=r" (val));

Yeah, something like this will be the fix. I am still thinking about
the right place to put the volatile to make it explicit to the situation
we are encountering here (which is SEV-ES specific).

Best would be an explicit barrier in C code between sev_es_ist_enter()
and the DR7 read, but all barriers I tried to far only seem to affect
memory instructions and had no influence on the DR7 read (which is
obviously not considered as a memory read by the compiler).

The best place to put the barrier is in the sev_es_ist_enter() inline
function, right after the static_call to __sev_es_ist_enter().

Regards,

	Joerg