Message-ID: <20231017111817.GAZS5teT4rFkXVD2KA@fat_crate.local>
Date:   Tue, 17 Oct 2023 13:18:17 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     "Li, Zhiquan1" <zhiquan1.li@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "patches@...ts.linux.dev" <patches@...ts.linux.dev>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "naoya.horiguchi@....com" <naoya.horiguchi@....com>
Subject: Re: [PATCH v3] x86/mce: Set PG_hwpoison page flag to avoid the
 capture kernel panic

On Tue, Oct 17, 2023 at 01:24:53AM +0000, Luck, Tony wrote:
> How about:
>
> When there is a fatal machine check Linux calls mce_panic()
> without checking to see if bad data at some memory address
> was reported in the machine check banks.

... for the simple reason that the kernel cannot afford to do any
unnecessary work and must panic immediately so that it can stop the
propagation of bad data.

Now, it's a whole different story whether that's the right thing to do
and whether the data has already propagated so that the panic is moot.

The whole point I'm trying to make is that the machine panics because
the error severity dictates it. There's no opportunity to queue recovery
work because the kernel simply cannot do so in that case. So the commit
message should simply state that we're marking the page as poison for
the kexec'ed kernel's sake and not because of anything else.

> If kexec is enabled, check for memory errors and mark the
> page as poisoned so that the kexec'd kernel can avoid accessing
> the page.

Yap, yours makes sense.
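
A minimal userspace sketch of the flow described in the quoted text,
for illustration only. It is not the patch under review: the types
mc_record and mock_page, the function mark_poison_before_panic(), and
the 4 KiB page size are all hypothetical stand-ins. It only shows the
ordering: on the fatal path no recovery is attempted, and if kexec is
enabled, any reported address gets its page flagged so the next kernel
can skip it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical, simplified view of what one machine check bank logged. */
struct mc_record {
    bool     addr_valid;  /* did the bank report a usable address? */
    uint64_t phys_addr;   /* physical address of the bad data, if valid */
};

/* Hypothetical stand-in for per-page state; only the poison flag is modeled. */
struct mock_page {
    bool hwpoison;
};

/*
 * Meant to run on the fatal path, right before panicking. It does no
 * recovery work; it only flags pages for the benefit of the kexec'ed
 * kernel. Returns the number of pages newly marked as poisoned.
 */
static size_t mark_poison_before_panic(const struct mc_record *recs,
                                       size_t nrecs,
                                       struct mock_page *pages,
                                       size_t npages,
                                       bool kexec_enabled)
{
    size_t marked = 0;

    assert(recs != NULL && pages != NULL);

    /* Without a kexec'ed kernel to protect, there is nothing to do. */
    if (!kexec_enabled)
        return 0;

    for (size_t i = 0; i < nrecs; i++) {
        uint64_t pfn;

        /* Skip banks that did not report a memory address. */
        if (!recs[i].addr_valid)
            continue;

        pfn = recs[i].phys_addr >> PAGE_SHIFT;

        /* Ignore addresses outside the modeled page array. */
        if (pfn >= npages)
            continue;

        if (!pages[pfn].hwpoison) {
            pages[pfn].hwpoison = true;
            marked++;
        }
    }

    return marked;
}
```

The caller would invoke this once with the collected records and then
panic as before; the marking changes nothing about the decision to
panic, which is still dictated by the error severity.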

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
