Message-ID: <4219d88f-5c09-408f-b72d-1685367072f0@amd.com>
Date: Mon, 27 Nov 2023 16:10:34 -0600
From: Tom Lendacky <thomas.lendacky@....com>
To: Ashwin Dayanand Kamat <kashwindayan@...are.com>,
linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
jroedel@...e.de, brijesh.singh@....com, ganb@...are.com,
tkundu@...are.com, vsirnapalli@...are.com, akaher@...are.com,
amakhalov@...are.com
Subject: Re: [PATCH] x86/sev: Update ghcb_version only once
On 11/6/23 00:32, Ashwin Dayanand Kamat wrote:
> A kernel crash caused by a page fault was observed while running the
> cpuhotplug LTP testcases on SEV-ES enabled systems. The crash occurred
> during hotplug, after the CPU was offlined and the process was migrated
> to a different CPU. setup_ghcb() is then called again and tries to
> update ghcb_version in sev_es_negotiate_protocol(). That variable is
> meant to be read-only once it has been initialised during boot, so the
> write results in a page fault.
>
> From logs,
> [ 256.447466] BUG: unable to handle page fault for address: ffffffffba556e70
> [ 256.447476] #PF: supervisor write access in kernel mode
> [ 256.447478] #PF: error_code(0x0003) - permissions violation
> [ 256.447479] PGD 8000667c0f067 P4D 8000667c0f067 PUD 8000667c10063 PMD 80080006674001e1
> [ 256.447483] Oops: 0003 [#1] PREEMPT SMP NOPTI
> [ 256.447487] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.1.45-8.ph5 #1-photon
> .
> .
> .
> .
> .
> [ 256.447511] CR2: ffffffffba556e70 CR3: 0008000667c0a004 CR4: 0000000000770ee0
> [ 256.447514] PKRU: 55555554
> [ 256.447515] Call Trace:
> [ 256.447516] <TASK>
> [ 256.447519] ? __die_body.cold+0x1a/0x1f
> [ 256.447526] ? __die+0x2a/0x35
> [ 256.447528] ? page_fault_oops+0x10c/0x270
> [ 256.447531] ? setup_ghcb+0x71/0x100
> [ 256.447533] ? __x86_return_thunk+0x5/0x6
> [ 256.447537] ? search_exception_tables+0x60/0x70
> [ 256.447541] ? __x86_return_thunk+0x5/0x6
> [ 256.447543] ? fixup_exception+0x27/0x320
> [ 256.447546] ? kernelmode_fixup_or_oops+0xa2/0x120
> [ 256.447549] ? __bad_area_nosemaphore+0x16a/0x1b0
> [ 256.447551] ? kernel_exc_vmm_communication+0x60/0xb0
> [ 256.447556] ? bad_area_nosemaphore+0x16/0x20
> [ 256.447558] ? do_kern_addr_fault+0x7a/0x90
> [ 256.447560] ? exc_page_fault+0xbd/0x160
> [ 256.447563] ? asm_exc_page_fault+0x27/0x30
> [ 256.447570] ? setup_ghcb+0x71/0x100
> [ 256.447572] ? setup_ghcb+0xe/0x100
> [ 256.447574] cpu_init_exception_handling+0x1b9/0x1f0
>
> The fix is to avoid updating the variable after it has been initialised during boot.
The call to sev_es_negotiate_protocol() could be moved down to after the
initial_vc_handler if-check in setup_ghcb(). That would then put the call
to sev_es_negotiate_protocol() only in the BSP boot phase (and it only
needs to be done once). Does doing that prevent the #PF for you?
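
To illustrate the ordering, here is a rough userspace model, not kernel
code and not a tested patch. Every model_* name, the
runtime_vc_handler_active flag (standing in for the initial_vc_handler
comparison), the counters and PLACEHOLDER_VERSION are made up for the
sketch; only the order of the two steps reflects the suggestion.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is real kernel code. */
#define PLACEHOLDER_VERSION 2            /* arbitrary value for the sketch */

static unsigned int ghcb_version;        /* written only by the negotiation step */
static bool runtime_vc_handler_active;   /* models the initial_vc_handler if-check */
static int negotiate_calls;              /* instrumentation for this sketch only */
static int per_cpu_setup_calls;          /* instrumentation for this sketch only */

/* Models the negotiation step: the only place ghcb_version is written. */
static bool model_negotiate_protocol(void)
{
	negotiate_calls++;
	ghcb_version = PLACEHOLDER_VERSION;
	return true;
}

/* Models setup_ghcb() with the negotiation moved below the handler check. */
static void model_setup_ghcb(void)
{
	/* Runtime handler already installed: per-CPU work only, then return. */
	if (runtime_vc_handler_active) {
		per_cpu_setup_calls++;
		return;
	}

	/* Reached only on the early BSP boot path, so this runs once. */
	if (!model_negotiate_protocol())
		return;	/* the real code would terminate the guest here */

	/* Boot GHCB setup would follow here. */
}

/* Replays early boot followed by one CPU hotplug; returns the negotiation count. */
int model_boot_then_hotplug(void)
{
	ghcb_version = 0;
	negotiate_calls = 0;
	per_cpu_setup_calls = 0;

	runtime_vc_handler_active = false;
	model_setup_ghcb();			/* BSP, early boot */

	runtime_vc_handler_active = true;	/* runtime #VC handler takes over */
	model_setup_ghcb();			/* CPU brought back online later */

	return negotiate_calls;
}
```

In this model the second call returns at the handler check, so
ghcb_version is written only during the boot phase and the hotplug path
never touches it.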
>
> Fixes: 95d33bfaa3e1 ("x86/sev: Register GHCB memory when SEV-SNP is active")
> Signed-off-by: Ashwin Dayanand Kamat <kashwindayan@...are.com>
> Co-developed-by: Bo Gan <ganb@...are.com>
This tag needs to be moved above your Signed-off-by: and it needs a
Signed-off-by: for the co-developer.
Thanks,
Tom
> ---
> arch/x86/kernel/sev-shared.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index ccb0915e84e1..a447908f2b4d 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -144,6 +144,9 @@ static bool sev_es_negotiate_protocol(void)
>  {
>  	u64 val;
>
> +	if (ghcb_version)
> +		return true;
> +
>  	/* Do the GHCB protocol version negotiation */
>  	sev_es_wr_ghcb_msr(GHCB_MSR_SEV_INFO_REQ);
>  	VMGEXIT();