[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <baa1b55b-8415-0c33-a6e5-f568c48f455f@suse.com>
Date: Wed, 11 Jan 2023 12:52:41 +0100
From: Juergen Gross <jgross@...e.com>
To: Peter Zijlstra <peterz@...radead.org>,
Joan Bruguera <joanbrugueram@...il.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Cooper <Andrew.Cooper3@...rix.com>,
"Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: Wake-up from suspend to RAM broken under `retbleed=stuff`
On 11.01.23 12:20, Peter Zijlstra wrote:
> On Mon, Jan 09, 2023 at 04:05:31AM +0000, Joan Bruguera wrote:
>> This fixes wakeup for me on both QEMU and real HW
>> (just a proof of concept, don't merge)
>>
>> diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
>> index ffea98f9064b..8704bcc0ce32 100644
>> --- a/arch/x86/kernel/callthunks.c
>> +++ b/arch/x86/kernel/callthunks.c
>> @@ -7,6 +7,7 @@
>> #include <linux/memory.h>
>> #include <linux/moduleloader.h>
>> #include <linux/static_call.h>
>> +#include <linux/suspend.h>
>>
>> #include <asm/alternative.h>
>> #include <asm/asm-offsets.h>
>> @@ -150,6 +151,10 @@ static bool skip_addr(void *dest)
>> if (dest >= (void *)hypercall_page &&
>> dest < (void*)hypercall_page + PAGE_SIZE)
>> return true;
>> +#endif
>> +#ifdef CONFIG_PM_SLEEP
>> + if (dest == restore_processor_state)
>> + return true;
>> #endif
>> return false;
>> }
>> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
>> index 236447ee9beb..e667894936f7 100644
>> --- a/arch/x86/power/cpu.c
>> +++ b/arch/x86/power/cpu.c
>> @@ -281,6 +281,9 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
>> /* Needed by apm.c */
>> void notrace restore_processor_state(void)
>> {
>> + /* Restore GS before calling anything to avoid crash on call depth accounting */
>> + native_wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
>> +
>> __restore_processor_state(&saved_context);
>> }
>
> Yeah, I can see why, but I'm not really comfortable with this. TBH, I
> don't see how the whole resume code is correct to begin with. At the
> very least it needs a heavy dose of noinstr.
>
> Rafael, what cr3 is active when we call restore_processor_state()?
>
> Specifically, the problem is that I don't feel comfortable doing any
> sort of weird code until all the CR and segment registers have been
> restored, however, write_cr*() are paravirt functions that result in
> CALL, which then gives us a bit of a checken and egg problem.
Under Xen the hypervisor should do all needed low level stuff when
resuming. Even dom0 is just a guest, so I don't see a problem with
PV guests. CR and segment registers should be in the vcpu structure(s)
in Xen and they should be restored by Xen when activating the vcpus
after suspend. There shouldn't be any need to do this again in the
kernel.
I have verified that the suspend path when running as Xen PV guest
is NOT calling restore_processor_state().
> I'm also wondering how well retbleed=stuff works on Xen, if at all. If
> we can ignore Xen, things are a little earier perhaps.
In restore_processor_state() you should be able to ignore Xen, unless
there are other code paths calling it (kexec can be ignored, too, as
that isn't supported in Xen PV, and suspend to disk shouldn't work
either).
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists