linux-kernel - Re: [PATCH v3 0/9] Parallel CPU bringup for x86

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <7e92a196e67b1bfa37c1e61a789f2b75a735c06f.camel@infradead.org>
Date:   Fri, 28 Jan 2022 09:54:45 +0000
From:   David Woodhouse <dwmw2@...radead.org>
To:     Tom Lendacky <thomas.lendacky@....com>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "mimoja@...oja.de" <mimoja@...oja.de>,
        "hewenliang4@...wei.com" <hewenliang4@...wei.com>,
        "hushiyuan@...wei.com" <hushiyuan@...wei.com>,
        "luolongjun@...wei.com" <luolongjun@...wei.com>,
        "hejingxian@...wei.com" <hejingxian@...wei.com>
Subject: Re: [PATCH v3 0/9] Parallel CPU bringup for x86_64

On Fri, 2021-12-17 at 14:55 -0600, Tom Lendacky wrote:
> On 12/17/21 2:13 PM, David Woodhouse wrote:
> > On Fri, 2021-12-17 at 13:46 -0600, Tom Lendacky wrote:
> > > There's no WARN or PANIC, just a reset. I can look to try and capture some
> > > KVM trace data if that would help. If so, let me know what events you'd
> > > like captured.
> > 
> > 
> > Could start with just kvm_run_exit?
> > 
> > Reason 8 would be KVM_EXIT_SHUTDOWN and would potentially indicate a
> > triple fault.
> 
> qemu-system-x86-24093   [005] .....  1601.759486: kvm_exit: vcpu 112 reason shutdown rip 0xffffffff81070574 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x80000b08 error_code 0x00000000
> 
> # addr2line -e woodhouse-build-x86_64/vmlinux 0xffffffff81070574
> /root/kernels/woodhouse-build-x86_64/./arch/x86/include/asm/desc.h:272
> 
> Which is: asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));

So, I remain utterly bemused by this, and the Milan *guests* I have
access to can't even kexec with a stock kernel; that is also "too fast"
and they take a triple fault during the bringup in much the same way —
even without my parallel patches, and even going back to fairly old
kernels.

I wasn't able to follow up with raw serial output during the bringup to
pinpoint precisely where it happens, because the VM would tear itself
down in response to the triple fault without actually flushing the last
virtual serial output :)

It would be really useful to get access to a suitable host where I can
spawn this in qemu and watch it fail. I am suspecting a chip-specific
quirk or bug at this point.

I might suggest in the short term that we could unblock the parallel
bringup work by just not doing it for affected chips... but that won't
make existing kexec work.



Download attachment "smime.p7s" of type "application/pkcs7-signature" (5965 bytes)