linux-kernel - Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic


Message-ID: <5886B33B.7080601@redhat.com>
Date:   Tue, 24 Jan 2017 09:51:55 +0800
From:   Xunlei Pang <xpang@...hat.com>
To:     Borislav Petkov <bp@...en8.de>, "Luck, Tony" <tony.luck@...el.com>
Cc:     xlpang@...hat.com, x86@...nel.org, linux-kernel@...r.kernel.org,
        kexec@...ts.infradead.org, Ingo Molnar <mingo@...hat.com>,
        Dave Young <dyoung@...hat.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Junichi Nomura <j-nomura@...jp.nec.com>,
        Kiyoshi Ueda <k-ueda@...jp.nec.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Subject: Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after
 system panic

On 01/24/2017 at 09:46 AM, Xunlei Pang wrote:
> On 01/24/2017 at 01:51 AM, Borislav Petkov wrote:
>> Hey Tony,
>>
>> a "welcome back" is in order? :-)
>>
>> On Mon, Jan 23, 2017 at 09:40:09AM -0800, Luck, Tony wrote:
>>> If the system had experienced some memory corruption, but
>>> recovered ... then there would be some pages sitting around
>>> that the old kernel had marked as POISON and stopped using.
>>> The kexec'd kernel doesn't know about these, so may touch that
>>> memory while taking a crash dump ...
>> Hmm, pass a list of poisoned pages to the kdump kernel so as not to
>> touch. Looks like there's already functionality for that:
>>
>> "makedumpfile can exclude the following types of pages while copying
>> VMCORE to DUMPFILE, and a user can choose which type of pages will be
>> excluded.
>>
>> - Pages filled with zero
>> - Cache pages
>> - User process data pages
>> - Free pages"
>>
>>  (there is a makedumpfile manpage somewhere)
>>
>> And apparently crash knows about poisoned pages and handles them:
>>
>> static int __init crash_save_vmcoreinfo_init(void)
>> {
>> 	...
>> #ifdef CONFIG_MEMORY_FAILURE
>>         VMCOREINFO_NUMBER(PG_hwpoison);
>> #endif
>>
>> so if that works, the kexeced kernel should know about that list.
> From the log in my previous reply, the MCE occurred before makedumpfile started
> dumping, so I wonder whether the poisoned pages belong to the crash reserved memory,
> or whether some other type of event is involved?

Another possible source is any system reserved or PCIe memory
that is shared between the 1st and 2nd kernels.
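
To make the PG_hwpoison point above concrete: VMCOREINFO_NUMBER(PG_hwpoison) exports the flag's bit number as a "NUMBER(PG_hwpoison)=<n>" line in vmcoreinfo, so a dump tool that reads page flags could in principle skip poisoned pages. Below is a rough userspace sketch of that check. The helper names vmcoreinfo_number() and page_is_hwpoison() are made up for illustration and are not taken from makedumpfile or crash; how a real tool obtains the flags word is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: look up "NUMBER(<name>)=<value>" in a vmcoreinfo
 * text blob (one KEY=VALUE entry per line).
 * Returns the value, or -1 if the key is not present.
 */
static long vmcoreinfo_number(const char *info, const char *name)
{
	char key[64];
	const char *p = info;
	size_t klen;

	snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	klen = strlen(key);

	while (p && *p) {
		/* Keys start at the beginning of a line. */
		if (strncmp(p, key, klen) == 0)
			return strtol(p + klen, NULL, 10);
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return -1;
}

/*
 * Hypothetical helper: returns 1 if the page flags word has the hwpoison
 * bit set, 0 otherwise (including when the bit number is unknown).
 */
static int page_is_hwpoison(unsigned long flags, long bit)
{
	if (bit < 0 || bit >= (long)(sizeof(flags) * 8))
		return 0;
	return (int)((flags >> bit) & 1UL);
}
```

A caller would parse the bit number once and then test each page's flags before reading the page. This only helps tools that look at page flags at all; a plain "cp" of /proc/vmcore has no such filter.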

>
> Besides, some kdump setups may not use makedumpfile at all; for example, a simple
> "cp" of "/proc/vmcore" is also allowed.
>
>>> and then you have a broadcast machine check (on older[1] Intel CPUs
>>> that don't support local machine check).
>> Right.
>>
>>> This is hard to work around. You really need all the CPUs to have set
>>> CR4.MCE=1 (if any didn't, then they will force a reset when they see
>>> the machine check). Also you need to make sure that they jump to the
>>> copy of do_machine_check() in the new kernel, not the old kernel.
>> Doesn't matter, right? The new copy is as clueless as the old one about
>> those MCEs.
>>
> It's the code in mce_start(): it waits for all the online CPUs, including the CPUs
> that the kdump kernel boots on, to synchronize.
>
> For the new MCE handler in the kdump kernel this is fine, because its count of online
> CPUs is correct. For the old MCE handler in the 1st kernel it is not: some CPUs that
> the 1st kernel still regards as online are now running the 2nd kernel, so they cannot
> respond to the old handler, which then times out.
>
> Regards,
> Xunlei
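
To illustrate the mce_start() timeout described above, here is a deliberately simplified model of the rendezvous. It is not the kernel code: rendezvous() and its parameters are hypothetical, and the real handler uses atomic counters and a time-based timeout rather than a tick loop. It only shows why a handler that expects more CPUs than can actually enter it must time out.

```c
/*
 * Simplified model of a machine-check rendezvous (NOT mce_start() itself).
 *
 * expected:   number of CPUs this handler believes are online
 * responding: number of CPUs that can actually enter this handler
 * timeout:    number of polling iterations before giving up
 *
 * Returns 0 if all expected CPUs checked in, -1 on timeout.
 */
static int rendezvous(int expected, int responding, int timeout)
{
	int callin = 0;	/* CPUs that have entered the handler so far */
	int tick;

	for (tick = 0; tick < timeout; tick++) {
		if (callin < responding)
			callin++;	/* one more CPU enters the handler */
		if (callin == expected)
			return 0;	/* everyone is here, proceed */
	}
	return -1;	/* some expected CPU never showed up */
}
```

With 4 CPUs online in the 1st kernel and one of them now running the kdump kernel, the old handler corresponds to rendezvous(4, 3, ...) and times out, while the kdump kernel's own handler corresponds to rendezvous(1, 1, ...) and completes.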