Message-ID: <e4e8fbe757860cd24e2f66b25be60d76663935d8.camel@kernel.org>
Date: Thu, 20 Jan 2022 15:01:29 +0200
From: Jarkko Sakkinen <jarkko@...nel.org>
To: Reinette Chatre <reinette.chatre@...el.com>,
dave.hansen@...ux.intel.com, tglx@...utronix.de, bp@...en8.de,
luto@...nel.org, mingo@...hat.com, linux-sgx@...r.kernel.org,
x86@...nel.org
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH] x86/sgx: Silence softlockup detection when releasing large enclaves

On Tue, 2022-01-18 at 11:14 -0800, Reinette Chatre wrote:
> Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
> triggers the softlockup detector.
>
> Actual SGX systems have 128GB of enclave memory or more. The
> "unclobbered_vdso_oversubscribed" selftest creates one enclave which
> consumes all of the enclave memory on the system. Tearing down such a
> large enclave takes around a minute, most of it in the loop where
> the EREMOVE instruction is applied to each individual 4k enclave
> page.
>
> Spending one minute in a loop triggers the softlockup detector.
>
> Add a cond_resched() to give other tasks a chance to run and placate
> the softlockup detector.
>
> Cc: stable@...r.kernel.org
> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> Reported-by: Vijay Dhanraj <vijay.dhanraj@...el.com>
> Acked-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@...el.com>
> ---
> Softlockup message:
> watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
> Kernel panic - not syncing: softlockup: hung tasks
> <snip>
> sgx_encl_release+0x86/0x1c0
> sgx_release+0x11c/0x130
> __fput+0xb0/0x280
> ____fput+0xe/0x10
> task_work_run+0x6c/0xc0
> exit_to_user_mode_prepare+0x1eb/0x1f0
> syscall_exit_to_user_mode+0x1d/0x50
> do_syscall_64+0x46/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> arch/x86/kernel/cpu/sgx/encl.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c
> b/arch/x86/kernel/cpu/sgx/encl.c
> index 001808e3901c..ab2b79327a8a 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -410,6 +410,7 @@ void sgx_encl_release(struct kref *ref)
>  		}
>  
>  		kfree(entry);
> +		cond_resched();
>  	}
>  
>  	xa_destroy(&encl->page_array);

I'd add a comment above the cond_resched() call, e.g.

	/* Invoke scheduler to prevent soft lockups. */

Other than that, the patch makes sense.

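To show where the comment would sit, here is a rough userspace sketch of the same loop shape. It is not the kernel code: struct page_entry, cond_resched_stub() and release_entries() are made-up names for illustration, and sched_yield() only stands in for cond_resched().

```c
#include <sched.h>
#include <stdlib.h>

/* Hypothetical stand-in for an enclave page entry; not the kernel's struct. */
struct page_entry {
	void *epc_page;	/* placeholder for the backing page */
};

/* Counts how often the loop offered the CPU to other tasks. */
static unsigned long yield_count;

/* Userspace stand-in for cond_resched(): let other runnable tasks in. */
static void cond_resched_stub(void)
{
	sched_yield();
	yield_count++;
}

/* Free every non-NULL entry, yielding once per freed entry. */
static size_t release_entries(struct page_entry **entries, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!entries[i])
			continue;

		free(entries[i]->epc_page);
		free(entries[i]);
		entries[i] = NULL;
		freed++;

		/* Invoke scheduler to prevent soft lockups. */
		cond_resched_stub();
	}

	return freed;
}
```

Yielding once per entry keeps the sketch close to the patch. Whether a per-page call is the right granularity in the kernel is a separate question that this sketch does not answer.
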
BR, Jarkko