Message-ID: <CALu+AoSZkq1kz-xjvHkkuJ3C71d0SM5ibEJurdgmkZqZvNp2dQ@mail.gmail.com>
Date: Wed, 28 Feb 2024 10:54:22 +0800
From: Dave Young <dyoung@...hat.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Tom Lendacky <thomas.lendacky@....com>, "Huang, Kai" <kai.huang@...el.com>,
"Gao, Chao" <chao.gao@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"luto@...nel.org" <luto@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>, "hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"nik.borisov@...e.com" <nik.borisov@...e.com>, "bhe@...hat.com" <bhe@...hat.com>
Subject: Re: [PATCH 1/4] x86/coco: Add a new CC attribute to unify cache flush during kexec
On Fri, 23 Feb 2024 at 18:41, Dave Young <dyoung@...hat.com> wrote:
>
> On Wed, 21 Feb 2024 at 17:33, Borislav Petkov <bp@...en8.de> wrote:
> >
> > On Tue, Feb 20, 2024 at 04:30:13PM -0600, Tom Lendacky wrote:
> > > > I believe the issues were that different Intel systems would hang or reset,
> > > > and it was bisected to the commit that added the WBINVD. It was a while
> > > > ago, but I remember that they were similar to what commit 1f5e7eb7868e
> > > > ended up fixing. That one was debugged because the WBINVD was still
> > > > occasionally issued, which resulted in the following patch:
> > >
> > > 9b040453d444 ("x86/smp: Dont access non-existing CPUID leaf")
> > >
> > > It just means that if we go to an unconditional WBINVD, then we need to be
> > > careful.
> >
> > Let's try it.
> >
> > Dave, do you remember what issues
> >
> > f23d74f6c66c ("x86/mm: Rework wbinvd, hlt operation in stop_this_cpu()")
> >
> > fixed?
>
> It should be the kexec reboot failure described in the thread below:
> https://lore.kernel.org/lkml/20180117072123.GA1866@dhcp-128-65.nay.redhat.com/
>
> >
> > If so, can you try the below diff on top of latest tip/master to see if
> > those issues would reappear?
>
> It was reproduced on an old laptop (ThinkPad T440s or T480s, I cannot
> remember which), but I have since replaced it with a different, newer
> one. I tried the latest tip/master with the if condition commented out,
> and kexec reboot works fine.
>
> Let me try to find an old laptop to see if I can do more tests, will
> get back later next week.
Update: I tested on an old laptop as well and did not find any problems
with an unconditional native_wbinvd(); kexec and kdump both work fine.
>
> >
> > Thx.
> >
> > ---
> >
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index ab49ade31b0d..ec4dcc9f70ca 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -824,8 +824,7 @@ void __noreturn stop_this_cpu(void *dummy)
> > >  	 * Test the CPUID bit directly because the machine might've cleared
> > >  	 * X86_FEATURE_SME due to cmdline options.
> > >  	 */
> > > -	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
> > > -		native_wbinvd();
> > > +	native_wbinvd();
> > >
> > >  	/*
> > >  	 * This brings a cache line back and dirties it, but
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette
> >