Message-ID: <469754ab-d8ec-168a-15c7-61045a880792@amd.com>
Date: Wed, 26 Apr 2023 14:18:34 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Dave Hansen <dave.hansen@...el.com>,
Tony Battersby <tonyb@...ernetics.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc: "H. Peter Anvin" <hpa@...or.com>,
Mario Limonciello <mario.limonciello@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH RFC] x86/cpu: fix intermittent lockup on poweroff
On 4/26/23 13:15, Dave Hansen wrote:
> On 4/26/23 10:51, Tom Lendacky wrote:
>>>> + /*
>>>> + * native_stop_other_cpus() will write to @stop_cpus_count after
>>>> + * observing that it went down to zero, which will invalidate the
>>>> + * cacheline on this CPU.
>>>> + */
>>>> + atomic_dec(&stop_cpus_count);
>>
>> This is probably going to pull in a cache line and cause the problem the
>> native_wbinvd() is trying to avoid.
>
> Is one _more_ cacheline really the problem?
The answer is: it depends. If the cacheline ends up modified/dirty, then
it can be a problem; a line that is only ever read stays clean and is
harmless.
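To spell that out in MESI terms, here's an illustrative sketch (mine,
not the patch; it assumes stop_cpus_count is the atomic_t from the RFC):

    /* Illustrative only: the point is the coherence state, not the API. */
    static void cacheline_state_sketch(void)
    {
            int v = atomic_read(&stop_cpus_count);  /* line comes in clean
                                                     * (Shared/Exclusive);
                                                     * WBINVD can simply
                                                     * discard it */

            atomic_dec(&stop_cpus_count);           /* line is now Modified;
                                                     * it has to be written
                                                     * back somewhere, which
                                                     * is exactly what the
                                                     * flush was trying to
                                                     * prevent */
            (void)v;
    }

So a decrement after the WBINVD parks a dirty line in the halted CPU's
cache unless, as the comment in the patch argues, the controlling CPU's
later write to that line invalidates it first.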
>
> Or is having _any_ cacheline pulled in a problem? What about the text
> page containing the WBINVD? How about all the page table pages that are
> needed to resolve %RIP to a physical address?
It's been a while since I looked into all this, but text and page table
pages didn't present any problems because they weren't modified; stack
memory, however, was. A plain wbinvd() went through the paravirt support,
and the stack data written by that call ended up in some page structs in
the kexec kernel (applicable to Zen1 and Zen2). Using native_wbinvd()
eliminated the stack writes after the WBINVD and didn't result in any
corruption following a kexec.
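For reference, the difference looks roughly like this (native_wbinvd()
is quoted from memory of arch/x86/include/asm/special_insns.h, and the
paravirt note is my summary, so treat the details as approximate):

    /* The raw instruction: no call, no extra stack traffic. */
    static inline void native_wbinvd(void)
    {
            asm volatile("wbinvd" : : : "memory");
    }

    /*
     * With CONFIG_PARAVIRT, plain wbinvd() instead dispatches through
     * pv_ops, so the WBINVD executes inside a called function and the
     * surrounding call/return sequence keeps touching the stack after
     * the flush itself has run, which is how the dirty stack lines
     * described above came about.
     */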
>
> What about the mds_idle_clear_cpu_buffers() code that snuck into
> native_halt()?
Luckily that is all inline and uses a static branch which isn't enabled
for AMD, so it should just jmp to the hlt, with no modified cache lines.
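(For reference, the inlines in question look roughly like this; quoted
from memory of the kernel headers, so consider it a paraphrase rather
than the exact source:)

    /* Paraphrase of arch/x86/include/asm/nospec-branch.h */
    static __always_inline void mds_idle_clear_cpu_buffers(void)
    {
            if (static_branch_likely(&mds_idle_clear))
                    mds_clear_cpu_buffers();        /* the verw in the
                                                     * disassembly below */
    }

    /* Paraphrase of arch/x86/include/asm/irqflags.h */
    static inline void native_halt(void)
    {
            mds_idle_clear_cpu_buffers();
            asm volatile("hlt" : : : "memory");
    }

With the mds_idle_clear key disabled, as it is on AMD, the only added
memory access is the read of the key itself, and a read caches clean.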
Thanks,
Tom
>
>> ffffffff810ede4c: 0f 09 wbinvd
>> ffffffff810ede4e: 8b 05 e4 3b a7 02 mov 0x2a73be4(%rip),%eax # ffffffff83b61a38 <mds_idle_clear>
>> ffffffff810ede54: 85 c0 test %eax,%eax
>> ffffffff810ede56: 7e 07 jle ffffffff810ede5f <stop_this_cpu+0x9f>
>> ffffffff810ede58: 0f 00 2d b1 75 13 01 verw 0x11375b1(%rip) # ffffffff82225410 <ds.6688>
>> ffffffff810ede5f: f4 hlt
>> ffffffff810ede60: eb ec jmp ffffffff810ede4e <stop_this_cpu+0x8e>
>> ffffffff810ede62: e8 59 40 1a 00 callq ffffffff81291ec0 <trace_hardirqs_off>
>> ffffffff810ede67: eb 85 jmp ffffffff810eddee <stop_this_cpu+0x2e>
>> ffffffff810ede69: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
>