Message-ID: <640866c5-9fe0-4f7b-a459-7a685dbe4092@intel.com>
Date: Fri, 19 Apr 2024 13:32:14 +1200
From: "Huang, Kai" <kai.huang@...el.com>
To: Haitao Huang <haitao.huang@...ux.intel.com>, <jarkko@...nel.org>,
	<dave.hansen@...ux.intel.com>, <tj@...nel.org>, <mkoutny@...e.com>,
	<linux-kernel@...r.kernel.org>, <linux-sgx@...r.kernel.org>,
	<x86@...nel.org>, <cgroups@...r.kernel.org>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <hpa@...or.com>, <sohil.mehta@...el.com>,
	<tim.c.chen@...ux.intel.com>
CC: <zhiquan1.li@...el.com>, <kristen@...ux.intel.com>, <seanjc@...gle.com>,
	<zhanb@...rosoft.com>, <anakrish@...rosoft.com>,
	<mikko.ylinen@...ux.intel.com>, <yangjie@...rosoft.com>,
	<chrisyan@...rosoft.com>
Subject: Re: [PATCH v12 09/14] x86/sgx: Implement async reclamation for cgroup



On 16/04/2024 3:20 pm, Haitao Huang wrote:
> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
> 
> In cases EPC pages need be allocated during a page fault and the cgroup
> usage is near its limit, an asynchronous reclamation needs be triggered
> to avoid blocking the page fault handling.
> 
> Create a workqueue, corresponding work item and function definitions
> for EPC cgroup to support the asynchronous reclamation.
> 
> In case the workqueue allocation is failed during init, disable cgroup.

It's fine and reasonable to disable the (SGX EPC) cgroup.  The problem is
that exactly what "disable" means here isn't clear.

Given SGX EPC is just one type of MISC cgroup resource, we cannot just
disable the MISC cgroup as a whole.

So, the first interpretation is that we treat the entire
MISC_CG_RES_SGX_EPC resource type as if it doesn't exist; that is, we
just don't show control files in the file system, and all EPC pages are
tracked in the global list.

But that might not be straightforward to implement in the SGX driver,
i.e., we might need more changes to the MISC cgroup core code so that it
can support disabling a particular resource at runtime -- I need to
double check.

So if that is not worth doing, we will still need to live with the fact
that the user is still able to create SGX cgroups in the hierarchy, see
those control files, and read/write them.

The second interpretation, I suppose, is that although the SGX cgroup is
still seen as supported in userspace, in the kernel we just treat it as
if it doesn't exist.

Specifically, that means: 1) we always return the root SGX cgroup for 
any EPC page when allocating a new one; 2) as a result, we still track 
all EPC pages in a single global list.

But from the code below ...


>   static int __sgx_cgroup_try_charge(struct sgx_cgroup *epc_cg)
>   {
>   	if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE))
> @@ -117,19 +226,28 @@ int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim reclaim)
>   {
>   	int ret;
>   
> +	/* cgroup disabled due to wq allocation failure during sgx_cgroup_init(). */
> +	if (!sgx_cg_wq)
> +		return 0;
> +

... IIUC you chose a (third) solution that is even one more step back:

It just makes try_charge() always succeed, but EPC pages are still 
managed in the "per-cgroup" list.

But this solution, AFAICT, doesn't work.  The reason is that when you
fail to allocate an EPC page you will do the global reclaim, but now the
global list is empty.

Am I missing anything?
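
To make the concern concrete, here is a toy userspace model, not kernel
code.  Every name in it (struct lru, global_lru, cgroup_lru,
track_page_wq_failed(), reclaim_from()) is made up for illustration, and
it assumes the global reclaim path only walks the global list:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an EPC LRU list; only the page count matters. */
struct lru {
	size_t nr_pages;	/* pages currently tracked on this list */
};

static struct lru global_lru;	/* models the global EPC list */
static struct lru cgroup_lru;	/* models one per-cgroup EPC list */

/*
 * Models the early return in the quoted hunk: the charge is skipped,
 * but the page is still tracked on the per-cgroup list.
 */
static void track_page_wq_failed(void)
{
	cgroup_lru.nr_pages++;
}

/* Reclaim up to nr pages from one list; returns how many were freed. */
static size_t reclaim_from(struct lru *lru, size_t nr)
{
	size_t freed;

	assert(lru != NULL);
	freed = nr < lru->nr_pages ? nr : lru->nr_pages;
	lru->nr_pages -= freed;
	return freed;
}
```

In this model, after a few calls to track_page_wq_failed(),
reclaim_from(&global_lru, ...) frees nothing because every page sits on
cgroup_lru.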

So my thinking is, we have two options:

1) Modify the MISC cgroup core code to allow the kernel to disable one 
particular resource.  It shouldn't be hard, e.g., we can add a 
'disabled' flag to the 'struct misc_res'.

Hmm.. wait, after checking, the MISC cgroup won't show any control files 
if the "capacity" of the resource is 0:

"
  * Miscellaneous resources capacity for the entire machine. 0 capacity
  * means resource is not initialized or not present in the host.
"

So I really think we should go with this route, i.e., by just setting
the EPC capacity to 0?

Note misc_cg_try_charge() will fail if capacity is 0, but we can make
the SGX charge path return success by explicitly checking whether the
SGX cgroup is disabled using a helper, e.g., sgx_cgroup_disabled().

And you always return the root SGX cgroup in sgx_get_current_cg() when 
sgx_cgroup_disabled() is true.
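
A rough userspace sketch of those two points, under stated assumptions:
epc_capacity and mock_misc_try_charge() are stand-ins I made up for the
MISC cgroup capacity and misc_cg_try_charge(), the struct is a
placeholder, and the signatures are simplified rather than taken from
the series:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder type; the real struct sgx_cgroup looks different. */
struct sgx_cgroup {
	int id;
};

static struct sgx_cgroup sgx_root_cg = { .id = 0 };
static unsigned long epc_capacity;	/* assumption: 0 means "disabled" */

static bool sgx_cgroup_disabled(void)
{
	return epc_capacity == 0;
}

/* Mock of misc_cg_try_charge(): refuses any charge when capacity is 0. */
static int mock_misc_try_charge(struct sgx_cgroup *cg, unsigned long amount)
{
	assert(cg != NULL);
	(void)amount;
	return epc_capacity ? 0 : -EINVAL;
}

/* Charging always succeeds while the SGX cgroup is disabled. */
static int sgx_cgroup_try_charge(struct sgx_cgroup *cg)
{
	if (sgx_cgroup_disabled())
		return 0;
	return mock_misc_try_charge(cg, 4096);
}

/* Every allocation is attributed to the root cgroup while disabled. */
static struct sgx_cgroup *sgx_get_current_cg(struct sgx_cgroup *task_cg)
{
	return sgx_cgroup_disabled() ? &sgx_root_cg : task_cg;
}
```

With capacity 0 the mock charge fails, but the wrapper still returns 0
and all pages end up accounted to the root cgroup.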

And in sgx_reclaim_pages_global(), you do something like:

	static void sgx_reclaim_pages_global(..)
	{
	#ifdef CONFIG_CGROUP_MISC
		if (sgx_cgroup_disabled())
			sgx_reclaim_pages(&sgx_root_cg.lru);
		else
			sgx_cgroup_reclaim_pages(misc_cg_root());
	#else
		sgx_reclaim_pages(&sgx_global_list);
	#endif
	}

I am perhaps missing some other spots too, but you get the idea.

Lastly, after typing all this, I believe we should have a separate patch
to handle disabling the SGX cgroup at initialization time.  And you can
even put this patch _somewhere_ after the patch

	"x86/sgx: Implement basic EPC misc cgroup functionality"

and before this patch.

It makes sense to have such a patch anyway, because with it we can
easily add a kernel command line option 'sgx_cgroup=disabled' if the
user wants it disabled (when someone has such a requirement in the
future).
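
Purely as an illustration of how small that could be, a userspace sketch
of the value parsing.  The option does not exist today, and
sgx_cgroup_param() and sgx_cgroup_force_disabled are hypothetical names.
The return value follows the __setup() handler convention (1 = handled,
0 = not handled):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag that the init code would consult. */
static bool sgx_cgroup_force_disabled;

/*
 * Parses the value part of a hypothetical "sgx_cgroup=" option.
 * Returns 1 if the value was recognized, 0 otherwise.
 */
static int sgx_cgroup_param(const char *val)
{
	assert(val != NULL);
	if (strcmp(val, "disabled") == 0) {
		sgx_cgroup_force_disabled = true;
		return 1;
	}
	return 0;
}
```

In the kernel this would presumably be wired up with
__setup("sgx_cgroup=", ...), with the flag feeding into
sgx_cgroup_disabled().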



