Message-ID: <CAGtprH_sedWE_MYmfp3z3RKY_Viq1GGV4qiA0H5g2g=W9LwiXA@mail.gmail.com>
Date: Sat, 18 Oct 2025 08:54:42 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Juergen Gross <jgross@...e.com>, "Reshetova, Elena" <elena.reshetova@...el.com>,
Paolo Bonzini <pbonzini@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"bp@...en8.de" <bp@...en8.de>, "tglx@...utronix.de" <tglx@...utronix.de>,
"peterz@...radead.org" <peterz@...radead.org>, "mingo@...hat.com" <mingo@...hat.com>, "hpa@...or.com" <hpa@...or.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "x86@...nel.org" <x86@...nel.org>,
"kas@...nel.org" <kas@...nel.org>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "Huang, Kai" <kai.huang@...el.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "Chatre, Reinette" <reinette.chatre@...el.com>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>, "Williams, Dan J" <dan.j.williams@...el.com>,
"ashish.kalra@....com" <ashish.kalra@....com>, "nik.borisov@...e.com" <nik.borisov@...e.com>,
"Gao, Chao" <chao.gao@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
"Chen, Farrah" <farrah.chen@...el.com>, Binbin Wu <binbin.wu@...ux.intel.com>
Subject: Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX
partial write erratum
On Thu, Oct 2, 2025 at 9:09 AM Vishal Annapurve <vannapurve@...gle.com> wrote:
>
> On Thu, Oct 2, 2025 at 8:06 AM Dave Hansen <dave.hansen@...el.com> wrote:
> >
> > On 10/2/25 00:46, Juergen Gross wrote:
> > > So lets compare the 2 cases with kdump enabled and disabled in your
> > > scenario (crash of the host OS):
> > >
> > > kdump enabled: No dump can be produced due to the #MC and system is
> > > rebooted.
> > >
> > > kdump disabled: No dump is produced and system is rebooted after crash.
> > > >
> > > > What is the main concern with kdump enabled? I don't see any
> > > disadvantage with enabling it, just the advantage that in many cases
> > > a dump will be written.
> > The disadvantage is that a kernel bug from long ago results in a machine
> > check. Machine checks are generally indicative of bad hardware. So the
> > disadvantage is that someone mistakes the long ago kernel bug for bad
> > hardware.
> >
> > There are two ways of looking at this:
> >
> > 1. A theoretically fragile kdump is better than no kdump at all. All of
> > the stars would have to align for kdump to _fail_ and we don't think
> > that's going to happen often enough to matter.
> > 2. kdump happens after kernel bugs. The machine checks happen because of
> > kernel bugs. It's not a big stretch to think that, at scale, kdump is
> > going to run in to these #MCs on a regular basis.
>
> Looking at Elena's response, I would say it's still *a* big stretch
> for kdump to run into these #MCs on a regular basis as following
> sequence is needed for problematic scenario:
> 1) Host OS bug should corrupt TDX private memory with a *partial
> write*, that is part of kernel memory.
> -> i.e. PAMT tables, SEPT tables, TD VCPU/VM metadata etc.
> -> IIUC corruption of guest memory is not a concern as that
> belongs to userspace.
> 2) TDX Module/TD shouldn't consume that poisoned memory.
> -> i.e. no walk of the metadata memory.
> 3) Host kernel needs to generate a bug that causes an orthogonal panic.
>
> *partial writes* IIUC need special instructions.
Circling back on this topic, I would like to reiterate a few points:
1) Google has been running workloads with the series [1] for ~2 years
now, and we haven't seen any issues with kdump functionality across
kernel bugs, real hardware issues, private memory corruption, etc.
2) IMO, rather than disabling kdump because host kernel bugs could
potentially corrupt private memory, it would be much more useful to
employ mechanisms like direct map removal to ensure that host bugs
leading to private memory corruption are caught much earlier.
Disabling kdump doesn't help with the problem here and just makes
things worse for the vast majority of other scenarios. On the other
hand, enabling kdump doesn't make the problem any worse than it is.
- Host IOMMU mappings should ideally also be restricted to
regions that don't overlap with private memory regions.
3) With DPAMT support [2], the likelihood of the host corrupting
private memory will be reduced for hosts that are not running any
confidential VMs at all.
[1] https://lore.kernel.org/lkml/cover.1727179214.git.kai.huang@intel.com/
[2] https://lore.kernel.org/kvm/20250918232224.2202592-1-rick.p.edgecombe@intel.com/