linux-kernel - Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <58A53A65.3000405@redhat.com>
Date:   Thu, 16 Feb 2017 13:36:37 +0800
From:   Xunlei Pang <xpang@...hat.com>
To:     Borislav Petkov <bp@...en8.de>, xlpang@...hat.com
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        kexec@...ts.infradead.org, Tony Luck <tony.luck@...el.com>,
        Ingo Molnar <mingo@...hat.com>, Dave Young <dyoung@...hat.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Junichi Nomura <j-nomura@...jp.nec.com>,
        Kiyoshi Ueda <k-ueda@...jp.nec.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Subject: Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after
 system panic

On 01/26/2017 at 02:44 PM, Borislav Petkov wrote:
> On Thu, Jan 26, 2017 at 02:30:02PM +0800, Xunlei Pang wrote:
>> The hardware machine check is hard to reproduce, but the mce code of
>> RHEL7 is quite the same as that of tip/master, anyway we are able to
>> inject software mce to reproduce it.
> Please give me your exact steps so that I can try to reproduce it here
> too.
>

Hi Borislav,

I tried to use qemu to inject SRAO("mce -b 0 0 0xb100000000000000 0x5 0x0 0x0"),
it works well in 1st kernel, but it doesn't work for 1st kernel after kdump boots(seems
the cpus remain in 1st kernel don't respond to the simulated broadcasting mce).

But in theory, we know cpus belong to kdump kernel can't respond to the
old mce handler, so a single SRAO injection in 1st kernel should be similar.
For example, I used "... -smp 2 -cpu Haswell" to launch a simulation with broadcast
mce supported, and inject SRAO to cpu0 only through qemu monitor
"mce 0 0 0xb100000000000000 0x5 0x0 0x0", cpu0 will timeout/panic and reboot
the machine as follows(running on linux-4.9):
  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 30 seconds..

Regards,
Xunlei