Message-ID: <ZtVv0mic9YUTpZO-@ziqianlu-kbl>
Date: Mon, 2 Sep 2024 15:57:06 +0800
From: Aaron Lu <aaron.lu@...el.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: Jarkko Sakkinen <jarkko@...nel.org>, Dave Hansen
<dave.hansen@...ux.intel.com>, <x86@...nel.org>, <linux-sgx@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Zhimin Luo <zhimin.luo@...el.com>, Kai Huang
<kai.huang@...el.com>
Subject: Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()
On Fri, Aug 30, 2024 at 07:03:33AM -0700, Dave Hansen wrote:
> On 8/29/24 23:02, Aaron Lu wrote:
> >> Also, I do think we should probably add some kind of sanity warning to
> >> the SGX code in another patch. If a node on an SGX system has CPUs and
> >> memory, it's very likely it will also have some EPC. It can be
> >> something soft like a pr_info(), but I think it would be nice to have.
> > I think there are systems with a valid reason not to set up an EPC section
> > per node, e.g. an 8-socket system with SNC=2: there would be a total of
> > 16 nodes and it's not possible to have one EPC section per node because
> > the upper limit of EPC sections is 8. I'm not sure a warning is
> > appropriate here, what do you think?
>
> While possible, those systems are pretty rare. I don't think a
> softly-worded pr_info() will scare anyone too much.
Understood.
Maybe something like below?
From e49a78f27956b3d62c5ef962320e63dc3eeb897c Mon Sep 17 00:00:00 2001
From: Aaron Lu <aaron.lu@...el.com>
Date: Mon, 2 Sep 2024 11:46:07 +0800
Subject: [PATCH] x86/sgx: Log information when a node lacks an EPC section
For optimized performance, firmware typically distributes EPC sections
evenly across different NUMA nodes. However, there are scenarios where
a node may have both CPUs and memory but no EPC section configured. For
example, in an 8-socket system with a Sub-NUMA-Cluster=2 setup, there
are a total of 16 nodes. Given that the maximum number of supported EPC
sections is 8, it is simply not feasible to assign one EPC section to
each node. This configuration is not incorrect - SGX will still operate
correctly; it is just not optimized from a NUMA standpoint.
For this reason, log a message when a node with both CPUs and memory
lacks an EPC section. This will provide users with a hint as to why they
might be experiencing less-than-ideal performance when running SGX
enclaves.
Suggested-by: Dave Hansen <dave.hansen@...el.com>
Signed-off-by: Aaron Lu <aaron.lu@...el.com>
---
arch/x86/kernel/cpu/sgx/main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 694fcf7a5e3a..3a79105455f1 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -848,6 +848,13 @@ static bool __init sgx_page_cache_init(void)
 		return false;
 	}
 
+	for_each_online_node(nid) {
+		if (!node_isset(nid, sgx_numa_mask) &&
+		    node_state(nid, N_MEMORY) && node_state(nid, N_CPU))
+			pr_info("node%d has both CPUs and memory but doesn't have an EPC section\n",
+				nid);
+	}
+
 	return true;
 }
--
2.45.2
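
For anyone who wants to try the check outside the kernel, below is a rough
userspace sketch of the same condition. It is only a model: struct node_masks,
mask_test() and report_nodes_without_epc() are hypothetical names invented for
this illustration, and plain bitmasks stand in for the kernel's nodemasks
(sgx_numa_mask, N_MEMORY, N_CPU). It is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 16

/*
 * Hypothetical stand-ins for the kernel nodemasks: bit n describes node n.
 * has_epc plays the role of sgx_numa_mask in the patch.
 */
struct node_masks {
	unsigned int online;
	unsigned int has_memory;
	unsigned int has_cpu;
	unsigned int has_epc;
};

static bool mask_test(unsigned int mask, int nid)
{
	return (mask >> nid) & 1u;
}

/*
 * Print one line for every online node that has CPUs and memory but no
 * EPC section, and return how many such nodes were found.
 */
static int report_nodes_without_epc(const struct node_masks *m)
{
	int nid, count = 0;

	assert(m != NULL);

	for (nid = 0; nid < MAX_NODES; nid++) {
		if (!mask_test(m->online, nid))
			continue;

		if (!mask_test(m->has_epc, nid) &&
		    mask_test(m->has_memory, nid) && mask_test(m->has_cpu, nid)) {
			printf("node%d has both CPUs and memory but doesn't have an EPC section\n",
			       nid);
			count++;
		}
	}

	return count;
}
```

As a usage example, the 8-socket SNC=2 case from the changelog can be modelled
by setting online, has_memory and has_cpu to 0xffff (16 nodes) and has_epc to
0x5555 (assuming, purely for illustration, that the 8 EPC sections land on the
even-numbered nodes). The function then reports the 8 odd-numbered nodes.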