Message-ID: <4be309656cb4e03793703098bbebab3dee93077e.camel@intel.com>
Date: Fri, 19 Apr 2024 22:44:59 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "hpa@...or.com" <hpa@...or.com>, "tim.c.chen@...ux.intel.com"
<tim.c.chen@...ux.intel.com>, "linux-sgx@...r.kernel.org"
<linux-sgx@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"jarkko@...nel.org" <jarkko@...nel.org>, "cgroups@...r.kernel.org"
<cgroups@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mkoutny@...e.com" <mkoutny@...e.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "haitao.huang@...ux.intel.com"
<haitao.huang@...ux.intel.com>, "Mehta, Sohil" <sohil.mehta@...el.com>,
"tj@...nel.org" <tj@...nel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v12 09/14] x86/sgx: Implement async reclamation for cgroup
On Fri, 2024-04-19 at 13:55 -0500, Haitao Huang wrote:
> On Thu, 18 Apr 2024 20:32:14 -0500, Huang, Kai <kai.huang@...el.com> wrote:
>
> >
> >
> > On 16/04/2024 3:20 pm, Haitao Huang wrote:
> > > From: Kristen Carlson Accardi <kristen@...ux.intel.com>
> > > In cases where EPC pages need to be allocated during a page fault and
> > > the cgroup usage is near its limit, asynchronous reclamation needs to be
> > > triggered to avoid blocking the page fault handling.
> > > Create a workqueue, corresponding work item and function definitions
> > > for EPC cgroup to support the asynchronous reclamation.
> > > In case the workqueue allocation fails during init, disable cgroup.
> >
> > It's fine and reasonable to disable (SGX EPC) cgroup. The problem is
> > "exactly what does this mean" isn't quite clear.
> >
> First, this is really a corner case most people don't care about: during
> init, the kernel can't even allocate a workqueue object. So I don't think we
> should write extra code to implement some sophisticated solution. Any
> solution we come up with may just not work the way the user wants, or may not
> solve the real issue, given that such an allocation failure happens even at
> init time.
I think for such a boot-time failure we can either BUG_ON() directly, or
handle it _nicely_, but not half-way. In my experience BUG_ON() should be
avoided in general, but it might be acceptable during kernel boot. I will
leave that to others.
[...]
> >
> > ..., IIUC you choose a (third) solution that is even one more step back:
> >
> > It just makes try_charge() always succeed, but EPC pages are still
> > managed in the "per-cgroup" list.
> >
> > But this solution, AFAICT, doesn't work. The reason is when you fail to
> > allocate EPC page you will do the global reclaim, but now the global
> > list is empty.
> >
> > Am I missing anything?
>
> But when cgroups are enabled in config, global reclamation starts from root
> and reclaims from the whole hierarchy if user may still be able to create.
> Just that we don't have async/sync per-cgroup reclaim triggered.
OK. I missed this as it is in a later patch.
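To make the point above concrete, here is a minimal userspace sketch of "global reclamation starts from root and reclaims from the whole hierarchy". It is only a model: `struct model_cg`, `MODEL_MAX_CHILDREN` and `model_reclaim_hierarchy()` are hypothetical names, not taken from the patch series, and the real code walks cgroup structures and LRU lists rather than counters.

```c
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4	/* arbitrary bound for this model only */

/* Hypothetical stand-in for a cgroup node; not a kernel type. */
struct model_cg {
	unsigned int lru_pages;	/* reclaimable pages tracked on this node */
	struct model_cg *children[MODEL_MAX_CHILDREN];	/* unused slots are NULL */
};

/*
 * Pre-order walk: take pages from @cg first, then from its descendants,
 * until @nr_to_scan pages were reclaimed or the subtree is exhausted.
 * Returns the number of pages reclaimed.
 */
static unsigned int model_reclaim_hierarchy(struct model_cg *cg,
					    unsigned int nr_to_scan)
{
	unsigned int reclaimed;
	size_t i;

	if (!cg || !nr_to_scan)
		return 0;

	reclaimed = cg->lru_pages < nr_to_scan ? cg->lru_pages : nr_to_scan;
	cg->lru_pages -= reclaimed;

	for (i = 0; i < MODEL_MAX_CHILDREN && cg->children[i] &&
		    reclaimed < nr_to_scan; i++)
		reclaimed += model_reclaim_hierarchy(cg->children[i],
						     nr_to_scan - reclaimed);

	return reclaimed;
}
```

In this model, calling the function on the root node plays the role of global reclaim: pages sitting in any descendant are still reachable even though no per-cgroup reclaim was triggered.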
>
> >
> > So my thinking is, we have two options:
> >
> > 1) Modify the MISC cgroup core code to allow the kernel to disable one
> > particular resource. It shouldn't be hard, e.g., we can add a
> > 'disabled' flag to the 'struct misc_res'.
> >
> > Hmm.. wait, after checking, the MISC cgroup won't show any control files
> > if the "capacity" of the resource is 0:
> >
> > "
> > * Miscellaneous resources capacity for the entire machine. 0 capacity
> > * means resource is not initialized or not present in the host.
> > "
> >
> > So I really suppose we should go with this route, i.e., by just setting
> > the EPC capacity to 0?
> >
> > Note misc_cg_try_charge() will fail if capacity is 0, but we can make it
> > return success by explicitly checking whether the SGX cgroup is disabled
> > using a helper, e.g., sgx_cgroup_disabled().
> >
> > And you always return the root SGX cgroup in sgx_get_current_cg() when
> > sgx_cgroup_disabled() is true.
> >
> > And in sgx_reclaim_pages_global(), you do something like:
> >
> > static void sgx_reclaim_pages_global(..)
> > {
> > #ifdef CONFIG_CGROUP_MISC
> > 	if (sgx_cgroup_disabled())
> > 		sgx_reclaim_pages(&sgx_root_cg.lru);
> > 	else
> > 		sgx_cgroup_reclaim_pages(misc_cg_root());
> > #else
> > 	sgx_reclaim_pages(&sgx_global_list);
> > #endif
> > }
> >
> > I am perhaps missing some other spots too but you got the idea.
> >
> > Finally, after typing all this, I believe we should have a separate patch
> > to handle disabling the SGX cgroup at initialization time. And you can even
> > put this patch _somewhere_ after the patch
> >
> > "x86/sgx: Implement basic EPC misc cgroup functionality"
> >
> > and before this patch.
> >
> > It makes sense to have such a patch anyway, because with it we can easily
> > add a kernel command line option 'sgx_cgroup=disabled' if the user wants it
> > disabled (when someone has such a requirement in the future).
> >
>
> I think we can add support for "sgx_cgroup=disabled" in future if indeed
> needed. But just for init failure, no?
>
It's not about the command line option, which we can add in the future when
needed. It's that we need a way to handle the SGX cgroup being disabled at
boot time nicely, because we already have a case where we need to do so.

Your approach looks half-way to me, and is not extensible in the future. If
we choose to do it, do it right -- that is, we need a way to disable it
completely in both kernel and userspace so that userspace won't be able to
see it.
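
A rough userspace sketch of what "disable it completely" could look like, assuming the capacity-0 idea discussed above. This is a model, not kernel code: every `model_*` name is hypothetical, and only the behaviour (capacity set to 0, charging always succeeds, everything attributed to the root) follows the discussion. The error codes are this model's own choice.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the misc resource and the SGX cgroup. */
struct model_res {
	unsigned long capacity;	/* 0: resource treated as not present */
	unsigned long usage;
};

struct model_sgx_cg {
	struct model_res epc;
};

static struct model_sgx_cg model_root_cg;
static bool model_cgroup_off;

/* Called when init cannot set up the cgroup, e.g. no workqueue. */
static void model_sgx_cgroup_disable(void)
{
	model_cgroup_off = true;
	model_root_cg.epc.capacity = 0;	/* hides the resource in this model */
}

static bool model_sgx_cgroup_disabled(void)
{
	return model_cgroup_off;
}

/* When disabled, every allocation is attributed to the root. */
static struct model_sgx_cg *model_get_current_cg(struct model_sgx_cg *task_cg)
{
	return model_sgx_cgroup_disabled() ? &model_root_cg : task_cg;
}

/* Charging always succeeds when disabled; otherwise enforce capacity. */
static int model_try_charge(struct model_sgx_cg *cg, unsigned long pages)
{
	if (model_sgx_cgroup_disabled())
		return 0;
	if (!cg->epc.capacity)
		return -EINVAL;
	if (cg->epc.usage + pages > cg->epc.capacity)
		return -EBUSY;
	cg->epc.usage += pages;
	return 0;
}
```

The point of routing everything through one `disabled` check is that the init-failure case and a possible future command line switch would share the same path, and nothing is left half-enabled on the kernel or the userspace side.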