Message-ID: <5d792dc5-ea8e-46d2-8031-44f8e92b0188@suse.com>
Date: Tue, 7 Oct 2025 15:31:42 +0200
From: Jürgen Groß <jgross@...e.com>
To: Dave Hansen <dave.hansen@...el.com>,
"Reshetova, Elena" <elena.reshetova@...el.com>,
"Annapurve, Vishal" <vannapurve@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>, "hpa@...or.com" <hpa@...or.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>,
"x86@...nel.org" <x86@...nel.org>, "kas@...nel.org" <kas@...nel.org>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "Huang, Kai" <kai.huang@...el.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"Chatre, Reinette" <reinette.chatre@...el.com>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"ashish.kalra@....com" <ashish.kalra@....com>,
"nik.borisov@...e.com" <nik.borisov@...e.com>, "Gao, Chao"
<chao.gao@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
"Chen, Farrah" <farrah.chen@...el.com>, Binbin Wu <binbin.wu@...ux.intel.com>
Subject: Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX
partial write erratum

On 02.10.25 17:06, Dave Hansen wrote:
> On 10/2/25 00:46, Juergen Gross wrote:
>> So let's compare the 2 cases with kdump enabled and disabled in your
>> scenario (crash of the host OS):
>>
>> kdump enabled: No dump can be produced due to the #MC and system is
>> rebooted.
>>
>> kdump disabled: No dump is produced and system is rebooted after crash.
>>
>> What is the main concern with kdump enabled? I don't see any
>> disadvantage with enabling it, just the advantage that in many cases
>> a dump will be written.
> The disadvantage is that a kernel bug from long ago results in a machine
> check. Machine checks are generally indicative of bad hardware. So the
> disadvantage is that someone mistakes the long ago kernel bug for bad
> hardware.
>
> There are two ways of looking at this:
>
> 1. A theoretically fragile kdump is better than no kdump at all. All of
> the stars would have to align for kdump to _fail_ and we don't think
> that's going to happen often enough to matter.
> 2. kdump happens after kernel bugs. The machine checks happen because of
> kernel bugs. It's not a big stretch to think that, at scale, kdump is
> going to run into these #MCs on a regular basis.
>
> Does that capture the two perspectives fairly?

Basically yes.

If we can't come to an agreement that kdump should be allowed in spite of
a potential #MC, maybe we could disable kdump only if TDX guests have been
active on the machine before? Disabling kdump on a distro kernel just because
TDX was enabled but without anyone having used TDX would be quite hard.

Juergen