[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b2ee67d2d1540c1eca1545a4121a4f9b34036e06.camel@intel.com>
Date: Tue, 18 Jun 2024 23:15:37 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "chenridong@...wei.com" <chenridong@...wei.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "mkoutny@...e.com"
<mkoutny@...e.com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "haitao.huang@...ux.intel.com"
<haitao.huang@...ux.intel.com>, "tim.c.chen@...ux.intel.com"
<tim.c.chen@...ux.intel.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "tj@...nel.org" <tj@...nel.org>,
"jarkko@...nel.org" <jarkko@...nel.org>, "Mehta, Sohil"
<sohil.mehta@...el.com>, "hpa@...or.com" <hpa@...or.com>, "bp@...en8.de"
<bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v15 05/14] x86/sgx: Implement basic EPC misc cgroup
functionality
On Tue, 2024-06-18 at 07:56 -0500, Haitao Huang wrote:
> On Tue, 18 Jun 2024 06:31:09 -0500, Huang, Kai <kai.huang@...el.com> wrote:
>
> >
> > > @@ -921,7 +956,8 @@ static int __init sgx_init(void)
> > > if (!sgx_page_cache_init())
> > > return -ENOMEM;
> > >
> > > - if (!sgx_page_reclaimer_init()) {
> > > + if (!sgx_page_reclaimer_init() || !sgx_cgroup_init()) {
> > > + misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
> > > ret = -ENOMEM;
> > > goto err_page_cache;
> > > }
> >
> > This code change is wrong due to two reasons:
> >
> > 1) If sgx_page_reclaimer_init() was successful, but sgx_cgroup_init()
> > failed, you actually need to 'goto err_kthread' because the ksgxd()
> > kernel
> > thread is already created and is running.
> >
> > 2) There are other cases after here that can also result in sgx_init() to
> > fail completely, e.g., registering sgx_dev_provision mics device. We
> > need
> > to reset the capacity to 0 for those cases as well.
> >
> > AFAICT, you need something like:
> >
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c
> > b/arch/x86/kernel/cpu/sgx/main.c
> > index 27892e57c4ef..46f9c26992a7 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -930,6 +930,10 @@ static int __init sgx_init(void)
> > if (ret)
> > goto err_kthread;
> > + ret = sgx_cgroup_init();
> > + if (ret)
> > + goto err_provision;
> > +
> > /*
> > * Always try to initialize the native *and* KVM drivers.
> > * The KVM driver is less picky than the native one and
> > @@ -941,10 +945,12 @@ static int __init sgx_init(void)
> > ret = sgx_drv_init();
> > if (sgx_vepc_init() && ret)
> > - goto err_provision;
> > + goto err_cgroup;
> > return 0;
> > +err_cgroup:
> > + /* SGX EPC cgroup cleanup */
> > err_provision:
> > misc_deregister(&sgx_dev_provision);
> > @@ -952,6 +958,8 @@ static int __init sgx_init(void)
> > kthread_stop(ksgxd_tsk);
> > err_page_cache:
> > + misc_misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
> > +
> > for (i = 0; i < sgx_nr_epc_sections; i++) {
> > vfree(sgx_epc_sections[i].pages);
> > memunmap(sgx_epc_sections[i].virt_addr);
> >
> >
> > I put the sgx_cgroup_init() before sgx_drv_init() and sgx_vepc_init(),
> > otherwise you will need sgx_drv_cleanup() and sgx_vepc_cleanup()
> > respectively when sgx_cgroup_init() fails.
> >
>
> Yes, good catch.
>
> > This looks a little bit weird too, though:
> >
> > Calling misc_misc_cg_set_capacity() to reset capacity to 0 is done at end
> > of sgx_init() error path, because the "set capacity" part is done in
> > sgx_epc_cache_init().
> > But logically, both "set capacity" and "reset capacity to 0" should be
> > SGX
> > EPC cgroup operation, so it's more reasonable to do "set capacity" in
> > sgx_cgroup_init() and do "reset to 0" in the
> >
> > /* SGX EPC cgroup cleanup */
> >
> > as shown above.
> >
> > Eventually, you will need to do EPC cgroup cleanup anyway, e.g., to free
> > the workqueue, so it's odd to have two places to handle EPC cgroup
> > cleanup.
> >
> > I understand the reason "set capacity" part is done in
> > sgx_page_cache_init() now is because in that function you can easily get
> > the capacity. But the fact is @sgx_numa_nodes also tracks EPC size for
> > each node, so you can also get the total EPC size from @sgx_numa_node in
> > sgx_cgroup_init() and set capacity there.
> >
> > In this case, you can put "reset capacity to 0" and "free workqueue"
> > together as the "SGX EPC cgroup cleanup", which is way more clear IMHO.
> >
> Okay, will expose @sgx_numa_nodes to epc_cgroup.c and do the calculations
> in sgx_cgroup_init().
>
Looks you will also need to expose @sgx_numa_mask, which looks overkill.
Other options:
1) Expose a function to return total EPC pages/size in "sgx.h".
2) Move out the new 'capacity' variable in this patch as a global variable
and expose it in "sgx.h" (perhaps rename to 'sgx_total_epc_pages/size').
3) Make sgx_cgroup_init() to take an argument of total EPC pages/size, and
pass it in sgx_init().
For 3) there are also options to get total EPC pages/size:
a) Move out the new 'capacity' variable in this patch as a static.
b) Add a function to calculate total EPC pages/size from sgx_numa_nodes.
Hmm.. I think we can just use option 2)?
Powered by blists - more mailing lists