linux-kernel - Re: [Question PATCH kernel] x86/amd/sev/nmi+vc: Fix stack handling (why is this happening?)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y9PKmZZPEaRDp0FC@8bytes.org>
Date:   Fri, 27 Jan 2023 13:59:05 +0100
From:   Joerg Roedel <joro@...tes.org>
To:     Alexey Kardashevskiy <aik@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>, kvm@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Sean Christopherson <seanjc@...gle.com>,
        Jiri Kosina <jkosina@...e.cz>, Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Tom Lendacky <thomas.lendacky@....com>
Subject: Re: [Question PATCH kernel] x86/amd/sev/nmi+vc: Fix stack handling
 (why is this happening?)

On Fri, Jan 27, 2023 at 10:56:26PM +1100, Alexey Kardashevskiy wrote:
> Here is the complete output of that VM (200k so not in the email):
> 
> https://github.com/aik/linux/commit/d0d6bbb58fcd927ddd1f8e9d42ab121920c7eafc

Thanks. So looking at the code in the traces:

Code starting with the faulting instruction
===========================================
   0:	65 48 8b 04 25 c0 db 	mov    %gs:0x2dbc0,%rax
   7:	02 00
   9:	48 8b 80 a8 08 00 00 	mov    0x8a8(%rax),%rax
  10:	0f 0d 48 70          	prefetchw 0x70(%rax)
  14:	e8                   	.byte 0xe8
  15:	82                   	.byte 0x82

I think the fault in the page-fault handler happens here:

DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
{
        unsigned long address = read_cr2();
        irqentry_state_t state;

        prefetchw(&current->mm->mmap_lock); <--- Here

To be precise, it faults while dereferencing current. That means that
GS_BASE is likely broken, need to find out why...

This at least explains why it page-faults in a loop until the stack
overflows and the guard page is hit.

Regards,

	Joerg