Message-ID: <27d19ea5-d078-405b-a963-91d19b4229c8@suse.com>
Date: Thu, 2 Oct 2025 09:46:54 +0200
From: Juergen Gross <jgross@...e.com>
To: "Reshetova, Elena" <elena.reshetova@...el.com>,
 "Annapurve, Vishal" <vannapurve@...gle.com>,
 "Hansen, Dave" <dave.hansen@...el.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>,
 "tglx@...utronix.de" <tglx@...utronix.de>,
 "peterz@...radead.org" <peterz@...radead.org>,
 "mingo@...hat.com" <mingo@...hat.com>, "hpa@...or.com" <hpa@...or.com>,
 "thomas.lendacky@....com" <thomas.lendacky@....com>,
 "x86@...nel.org" <x86@...nel.org>, "kas@...nel.org" <kas@...nel.org>,
 "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
 "dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "Huang, Kai" <kai.huang@...el.com>,
 "seanjc@...gle.com" <seanjc@...gle.com>,
 "Chatre, Reinette" <reinette.chatre@...el.com>,
 "Yamahata, Isaku" <isaku.yamahata@...el.com>,
 "Williams, Dan J" <dan.j.williams@...el.com>,
 "ashish.kalra@....com" <ashish.kalra@....com>,
 "nik.borisov@...e.com" <nik.borisov@...e.com>, "Gao, Chao"
 <chao.gao@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
 "Chen, Farrah" <farrah.chen@...el.com>, Binbin Wu <binbin.wu@...ux.intel.com>
Subject: Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX
 partial write erratum
On 02.10.25 08:59, Reshetova, Elena wrote:
>> On Wed, Oct 1, 2025 at 7:32 AM Dave Hansen <dave.hansen@...el.com>
>> wrote:
>>>
>>> On 9/30/25 19:05, Vishal Annapurve wrote:
>>> ...
>>>>> Any workarounds are going to be slow and probably imperfect. That's not
>>>>
>>>> Do we really need to deploy workarounds that are complex and slow to
>>>> get kdump working for the majority of the scenarios? Is there any
>>>> analysis done for the risk with imperfect and simpler workarounds vs
>>>> benefits of kdump functionality?
>>>>
>>>>> a great match for kdump. I'm perfectly happy waiting for fixed hardware
>>>>> from what I've seen.
>>>>
>>>> IIUC SPR/EMR - two CPU generations out there are impacted by this
>>>> erratum and just disabling kdump functionality IMO is not the best
>>>> solution here.
>>>
>>> That's an eminently reasonable position. But we're speaking in broad
>>> generalities and I'm unsure what you don't like about the status quo or
>>> how you'd like to see things change.
>>
>> Looks like the decision to disable kdump was taken between [1] -> [2].
>> "The kernel currently doesn't track which page is TDX private memory.
>> It's not trivial to reset TDX private memory.  For simplicity, this
>> series simply disables kexec/kdump for such platforms.  This will be
>> enhanced in the future."
>>
>> A patch [3] from the series[1], describes the issue as:
>> "This problem is triggered by "partial" writes where a write transaction
>> of less than cacheline lands at the memory controller.  The CPU does
>> these via non-temporal write instructions (like MOVNTI), or through
>> UC/WC memory mappings.  The issue can also be triggered away from the
>> CPU by devices doing partial writes via DMA."
>>
>> And also mentions:
>> "Also note only the normal kexec needs to worry about this problem, but
>> not the crash kexec: 1) The kdump kernel only uses the special memory
>> reserved by the first kernel, and the reserved memory can never be used
>> by TDX in the first kernel; 2) The /proc/vmcore, which reflects the
>> first (crashed) kernel's memory, is only for read.  The read will never
>> "poison" TDX memory thus cause unexpected machine check (only partial
>> write does)."
> 
> While the statement that the read will never poison the memory is correct,
> the situation we can theoretically worry about is the following in my understanding:
> 
> 1. During its execution on platform with partial write problem, host OS or other
> actor executing outside of SEAM mode triggers partial write into a cache line that
> originally belonged to TDX private memory.
> This is smth that host OS or other entities should not do, but it could happen due
> to host OS bugs, etc.
> 2. The above causes the specified cache line to be poisoned by mem controller.
> However, here we assume that no one accesses this cache line from TDX module,
> TD guests or Host OS for the time being and the problem remains hidden.
> 3. Host OS crashes due to some other issue, kdump crash kernel is triggered,
> and kdump starts to read all the memory from the previous host kernel to dump
> the diagnostics info.
> 4. At some point of time, kdump crash kernel reaches the memory with the poisoned
> cache line, consumes poison, and the #MC is issued for the kernel space.
> 
> Isn't this the reason for also disabling kdump? Or do I miss smth?

So let's compare the two cases, kdump enabled and kdump disabled, in your
scenario (crash of the host OS):

kdump enabled: No dump can be produced due to the #MC, and the system is rebooted.
kdump disabled: No dump is produced, and the system is rebooted after the crash.

What is the main concern with kdump enabled? I don't see any disadvantage in
enabling it, only the advantage that in many cases a dump will be written.

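The comparison can be sketched as a toy model. This is purely illustrative
Python, not kernel code: CacheLine, MachineCheck, read_line and crash_outcome
are hypothetical names, and the model assumes that consuming a poisoned line
aborts the dump and that the machine reboots in every case.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    # True if an earlier partial write poisoned this line (assumed model).
    poisoned: bool = False


class MachineCheck(Exception):
    """Stands in for the #MC raised when poison is consumed."""


def read_line(line: CacheLine) -> bytes:
    # Reading never poisons memory; it only consumes existing poison.
    if line.poisoned:
        raise MachineCheck
    return b"\0" * 64


def crash_outcome(memory, kdump_enabled: bool):
    """Return (dump_written, rebooted) after a host OS crash."""
    if not kdump_enabled:
        return (False, True)      # no dump attempted, plain reboot
    try:
        for line in memory:       # crash kernel reads the old kernel's memory
            read_line(line)
    except MachineCheck:
        return (False, True)      # poison consumed: no dump, reboot
    return (True, True)           # no poison hit: dump written, reboot


clean = [CacheLine() for _ in range(8)]
dirty = [CacheLine() for _ in range(8)]
dirty[3].poisoned = True

for name, mem in (("clean", clean), ("poisoned", dirty)):
    for enabled in (True, False):
        print(name, "kdump enabled" if enabled else "kdump disabled",
              crash_outcome(mem, enabled))
```

In this model the poisoned case gives the same result with kdump enabled or
disabled, while the clean case yields a dump only with kdump enabled.
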
Juergen