Date:   Tue, 24 Jan 2017 09:27:45 +0800
From:   Xunlei Pang <xpang@...hat.com>
To:     Borislav Petkov <bp@...en8.de>, xlpang@...hat.com
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        kexec@...ts.infradead.org, Tony Luck <tony.luck@...el.com>,
        Ingo Molnar <mingo@...hat.com>, Dave Young <dyoung@...hat.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Junichi Nomura <j-nomura@...jp.nec.com>,
        Kiyoshi Ueda <k-ueda@...jp.nec.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Subject: Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after
 system panic

On 01/23/2017 at 10:50 PM, Borislav Petkov wrote:
> On Mon, Jan 23, 2017 at 09:35:53PM +0800, Xunlei Pang wrote:
>> One possible timing sequence would be:
>> 1st kernel running on multiple cpus panicked
>> then the crash dump code starts
>> the crash dump code stops the other cpus except the crashing one
>> 2nd kernel boots up on the crash cpu with "nr_cpus=1"
>> some broadcasted mce comes on some cpu amongst the other cpus (not the crashing cpu)
> Where does this broadcasted MCE come from?
>
> The crash dump code triggered it? Or it happened before the panic()?
>
> Are you talking about an *actual* sequence which you're experiencing on
> real hw or is this something hypothetical?
>

It occurred on real hardware when testing crash dump.

1) SysRq-c was injected for the test in the 1st kernel
    [ 49.897279] SysRq : Trigger a crash

2) The 2nd kernel started for kdump
    [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.10.0-229.el7.x86_64 root=UUID=976a15c8-8cbe-44ad-bb91-23f9b18e8789 ro console=ttyS1,115200 nmi_watchdog=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug disable_cpu_apicid=0 elfcorehdr=869772K

3) An MCE came to the 1st kernel, a timeout panic occurred, and the machine rebooted
    [    6.095706] Dazed and confused, but trying to continue  // message of the 1st kernel
    [   81.655507] Kernel panic - not syncing: Timeout synchronizing machine check over CPUs
    [   82.729324] Shutting down cpus with NMI
    [   82.774539] drm_kms_helper: panic occurred, switching back to text console
    [   82.782257] Rebooting in 10 seconds..

Please see the attached dmesg.txt for the full log.

Regards,
Xunlei


View attachment "dmesg.txt" of type "text/plain" (27414 bytes)
