Message-ID: <aefed99b-6747-5dcc-65ec-6880f7c0d207@citrix.com>
Date: Wed, 11 Jan 2023 11:39:57 +0000
From: Andrew Cooper <Andrew.Cooper3@...rix.com>
To: Peter Zijlstra <peterz@...radead.org>,
Joan Bruguera <joanbrugueram@...il.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Juergen Gross <jgross@...e.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
xen-devel <xen-devel@...ts.xenproject.org>,
Jan Beulich <jbeulich@...e.com>,
Roger Pau Monne <roger.pau@...rix.com>
Subject: Re: Wake-up from suspend to RAM broken under `retbleed=stuff`
On 11/01/2023 11:20 am, Peter Zijlstra wrote:
> On Mon, Jan 09, 2023 at 04:05:31AM +0000, Joan Bruguera wrote:
>> This fixes wakeup for me on both QEMU and real HW
>> (just a proof of concept, don't merge)
>>
>> diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
>> index ffea98f9064b..8704bcc0ce32 100644
>> --- a/arch/x86/kernel/callthunks.c
>> +++ b/arch/x86/kernel/callthunks.c
>> @@ -7,6 +7,7 @@
>> #include <linux/memory.h>
>> #include <linux/moduleloader.h>
>> #include <linux/static_call.h>
>> +#include <linux/suspend.h>
>>
>> #include <asm/alternative.h>
>> #include <asm/asm-offsets.h>
>> @@ -150,6 +151,10 @@ static bool skip_addr(void *dest)
>> if (dest >= (void *)hypercall_page &&
>> dest < (void*)hypercall_page + PAGE_SIZE)
>> return true;
>> +#endif
>> +#ifdef CONFIG_PM_SLEEP
>> + if (dest == restore_processor_state)
>> + return true;
>> #endif
>> return false;
>> }
>> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
>> index 236447ee9beb..e667894936f7 100644
>> --- a/arch/x86/power/cpu.c
>> +++ b/arch/x86/power/cpu.c
>> @@ -281,6 +281,9 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
>> /* Needed by apm.c */
>> void notrace restore_processor_state(void)
>> {
>> + /* Restore GS before calling anything to avoid crash on call depth accounting */
>> + native_wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
>> +
>> __restore_processor_state(&saved_context);
>> }
> Yeah, I can see why, but I'm not really comfortable with this. TBH, I
> don't see how the whole resume code is correct to begin with. At the
> very least it needs a heavy dose of noinstr.
>
> Rafael, what cr3 is active when we call restore_processor_state()?
>
> Specifically, the problem is that I don't feel comfortable doing any
> sort of weird code until all the CR and segment registers have been
> restored, however, write_cr*() are paravirt functions that result in
> CALL, which then gives us a bit of a chicken and egg problem.
>
> I'm also wondering how well retbleed=stuff works on Xen, if at all. If
> we can ignore Xen, things are a little easier perhaps.
I really would like retbleed=stuff to work under Xen PV, because then we
can arrange to start turning off some even more expensive mitigations
that Xen does on behalf of guests.

I have a suspicion that these paths will be used under Xen PV, even if
only for dom0. The abstractions for host S3/4/5 are not great. That
said, at all points that guest code is executing, even after a logical
S3 resume, it will have a good GS_BASE (assuming the teardown logic
doesn't self-clobber).

The bigger issue with stuff accounting is that nothing AFAICT accounts
for the fact that any hypercall potentially empties the RSB in otherwise
synchronous program flow.
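
To illustrate that last point, here is a toy model in C. It is only a sketch, not kernel or Xen code: the 16-entry RSB size, the struct, and every function name are assumptions made up for illustration, and the real call depth accounting is more involved. It shows a software depth counter drifting away from the real RSB contents once a hypercall clears the RSB behind the guest's back.

```c
#include <stdio.h>

#define RSB_ENTRIES 16 /* assumed RSB capacity for this toy model */

/* Hypothetical model of one CPU: real RSB fill vs. what software believes. */
struct cpu_model {
	int rsb_valid;     /* entries the hardware RSB actually holds */
	int tracked_depth; /* depth the software accounting believes */
};

/* A CALL pushes one RSB entry and bumps the tracked depth. */
static void model_call(struct cpu_model *c)
{
	if (c->rsb_valid < RSB_ENTRIES)
		c->rsb_valid++;
	if (c->tracked_depth < RSB_ENTRIES)
		c->tracked_depth++;
}

/* A RET pops one entry; returns 1 if the RSB was already empty. */
static int model_ret(struct cpu_model *c)
{
	int underflow = (c->rsb_valid == 0);

	if (!underflow)
		c->rsb_valid--;
	if (c->tracked_depth > 0)
		c->tracked_depth--;
	return underflow;
}

/* Assumption: the hypervisor empties the RSB; the guest counter is untouched. */
static void model_hypercall(struct cpu_model *c)
{
	c->rsb_valid = 0;
}

/*
 * Make `depth` nested calls, optionally issue a hypercall at the deepest
 * point, then return all the way out. Count returns that underflow the
 * RSB while the software counter still claims entries are present.
 */
static int count_unnoticed_underflows(int depth, int do_hypercall)
{
	struct cpu_model c = { 0, 0 };
	int i, unnoticed = 0;

	for (i = 0; i < depth; i++)
		model_call(&c);
	if (do_hypercall)
		model_hypercall(&c);
	for (i = 0; i < depth; i++) {
		int believed_safe = (c.tracked_depth > 0);

		if (model_ret(&c) && believed_safe)
			unnoticed++;
	}
	return unnoticed;
}
```

In this model, four nested calls followed by four returns never underflow, whereas the same sequence with a hypercall at the deepest point underflows on every return while the counter still reports a non-zero depth.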
~Andrew