[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <op.2mf3ykfswjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Thu, 18 Apr 2024 17:41:19 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "hpa@...or.com" <hpa@...or.com>, "tim.c.chen@...ux.intel.com"
<tim.c.chen@...ux.intel.com>, "linux-sgx@...r.kernel.org"
<linux-sgx@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"jarkko@...nel.org" <jarkko@...nel.org>, "cgroups@...r.kernel.org"
<cgroups@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mkoutny@...e.com" <mkoutny@...e.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "Mehta, Sohil"
<sohil.mehta@...el.com>, "tj@...nel.org" <tj@...nel.org>, "mingo@...hat.com"
<mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>, "Huang, Kai"
<kai.huang@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v12 05/14] x86/sgx: Implement basic EPC misc cgroup
functionality
On Tue, 16 Apr 2024 08:22:06 -0500, Huang, Kai <kai.huang@...el.com> wrote:
> On Mon, 2024-04-15 at 20:20 -0700, Haitao Huang wrote:
>> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
>>
>> SGX Enclave Page Cache (EPC) memory allocations are separate from normal
>> RAM allocations, and are managed solely by the SGX subsystem. The
>> existing cgroup memory controller cannot be used to limit or account for
>> SGX EPC memory, which is a desirable feature in some environments. For
>> instance, within a Kubernetes environment, while a user may specify a
>> particular EPC quota for a pod, the orchestrator requires a mechanism to
>> enforce that the pod's actual runtime EPC usage does not exceed the
>> allocated quota.
>>
>> Utilize the misc controller [admin-guide/cgroup-v2.rst, 5-9. Misc] to
>> limit and track EPC allocations per cgroup. Earlier patches have added
>> the "sgx_epc" resource type in the misc cgroup subsystem. Add basic
>> support in SGX driver as the "sgx_epc" resource provider:
>>
>> - Set "capacity" of EPC by calling misc_cg_set_capacity()
>> - Update EPC usage counter, "current", by calling charge and uncharge
>> APIs for EPC allocation and deallocation, respectively.
>> - Setup sgx_epc resource type specific callbacks, which perform
>> initialization and cleanup during cgroup allocation and deallocation,
>> respectively.
>>
>> With these changes, the misc cgroup controller enables users to set a
>> hard
>> limit for EPC usage in the "misc.max" interface file. It reports current
>> usage in "misc.current", the total EPC memory available in
>> "misc.capacity", and the number of times EPC usage reached the max limit
>> in "misc.events".
>>
>> For now, the EPC cgroup simply blocks additional EPC allocation in
>> sgx_alloc_epc_page() when the limit is reached. Reclaimable pages are
>> still tracked in the global active list, only reclaimed by the global
>> reclaimer when the total free page count is lower than a threshold.
>>
>> Later patches will reorganize the tracking and reclamation code in the
>> global reclaimer and implement per-cgroup tracking and reclaiming.
>>
>> Co-developed-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> Reviewed-by: Jarkko Sakkinen <jarkko@...nel.org>
>> Reviewed-by: Tejun Heo <tj@...nel.org>
>> Tested-by: Jarkko Sakkinen <jarkko@...nel.org>
>
> I don't see any big issue, so feel free to add:
>
> Reviewed-by: Kai Huang <kai.huang@...el.com>
>
Thanks
> Nitpickings below:
>
> [...]
>
>
>> --- /dev/null
>> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> @@ -0,0 +1,72 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright(c) 2022-2024 Intel Corporation. */
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/kernel.h>
>
> It doesn't seem you need the above two here.
>
> Probably they are needed in later patches, in that case we can move to
> the
> relevant patch(es) that they got used.
>
> However I think it's better to explicitly include <linux/slab.h> since
> kzalloc()/kfree() are used.
>
> Btw, I am not sure whether you want to use <linux/kernel.h> because looks
> it contains a lot of unrelated staff. Anyway I guess nobody cares.
>
I'll check and remove as needed.
>> +#include "epc_cgroup.h"
>> +
>> +/* The root SGX EPC cgroup */
>> +static struct sgx_cgroup sgx_cg_root;
>
> The comment isn't necessary (sorry didn't notice before), because the
> code
> is pretty clear saying that IMHO.
>
Was requested by Jarkko:
https://lore.kernel.org/lkml/CYU504RLY7QU.QZY9LWC076NX@suppilovahvero/#t
> [...]
>
>>
>> --- /dev/null
>> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
>> @@ -0,0 +1,72 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _SGX_EPC_CGROUP_H_
>> +#define _SGX_EPC_CGROUP_H_
>> +
>> +#include <asm/sgx.h>
>
> I don't see why you need <asm/sgx.h> here. Also, ...
>
>> +#include <linux/cgroup.h>
>> +#include <linux/misc_cgroup.h>
>> +
>> +#include "sgx.h"
>
> ... "sgx.h" already includes <asm/sgx.h>
>
> [...]
>
right
>>
>> +static inline struct sgx_cgroup *sgx_get_current_cg(void)
>> +{
>> + /* get_current_misc_cg() never returns NULL when Kconfig enabled */
>> + return sgx_cgroup_from_misc_cg(get_current_misc_cg());
>> +}
>
> I spent some time looking into this. And yes if I was reading code
> correctly the get_current_misc_cg() should never return NULL when Kconfig
> is on.
>
> I typed my analysis below in [*]. And it would be helpful if any cgroup
> expert can have a second eye on this.
>
> [...]
>
Thanks for checking this and I did similar and agree with the conclusion.
I think this is confirmed also by Michal's description AFAICT:
"
The current implementation creates root css object (see cgroup_init(),
cgroup_ssid_enabled() check is after cgroup_init_subsys()).
I.e. it will look like all tasks are members of root cgroup wrt given
controller permanently and controller attribute files won't exist."
>
>> --- a/arch/x86/kernel/cpu/sgx/main.c
>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>> @@ -6,6 +6,7 @@
>> #include <linux/highmem.h>
>> #include <linux/kthread.h>
>> #include <linux/miscdevice.h>
>> +#include <linux/misc_cgroup.h>
>
> Is this needed? I believe SGX variants in "epc_cgroup.h" should be
> enough
> for sgx/main.c?
>
> [...]
>
>
right
> [*] IIUC get_current_misc_cg() should never return NULL when Kconfig is
> on
yes
[...]
Thanks
Haitao
Powered by blists - more mailing lists