linux-kernel - Re: [PATCH v12 08/14] x86/sgx: Add basic EPC reclamation flow for cgroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8a9ae08a-c813-442d-9fc3-031a4c984700@intel.com>
Date: Thu, 18 Apr 2024 11:51:28 +1200
From: "Huang, Kai" <kai.huang@...el.com>
To: Haitao Huang <haitao.huang@...ux.intel.com>, <jarkko@...nel.org>,
	<dave.hansen@...ux.intel.com>, <tj@...nel.org>, <mkoutny@...e.com>,
	<linux-kernel@...r.kernel.org>, <linux-sgx@...r.kernel.org>,
	<x86@...nel.org>, <cgroups@...r.kernel.org>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <hpa@...or.com>, <sohil.mehta@...el.com>,
	<tim.c.chen@...ux.intel.com>
CC: <zhiquan1.li@...el.com>, <kristen@...ux.intel.com>, <seanjc@...gle.com>,
	<zhanb@...rosoft.com>, <anakrish@...rosoft.com>,
	<mikko.ylinen@...ux.intel.com>, <yangjie@...rosoft.com>,
	<chrisyan@...rosoft.com>
Subject: Re: [PATCH v12 08/14] x86/sgx: Add basic EPC reclamation flow for
 cgroup



On 16/04/2024 3:20 pm, Haitao Huang wrote:
> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
> 
> Currently in the EPC page allocation, the kernel simply fails the
> allocation when the current EPC cgroup fails to charge due to its usage
> reaching limit.  This is not ideal. When that happens, a better way is
> to reclaim EPC page(s) from the current EPC cgroup (and/or its
> descendants) to reduce its usage so the new allocation can succeed.
> 
> Add the basic building blocks to support per-cgroup reclamation.
> 
> Currently the kernel only has one place to reclaim EPC pages: the global
> EPC LRU list.  To support the "per-cgroup" EPC reclaim, maintain an LRU
> list for each EPC cgroup, and introduce a "cgroup" variant function to
> reclaim EPC pages from a given EPC cgroup and its descendants.
> 
> Currently the kernel does the global EPC reclaim in sgx_reclaim_page().
> It always tries to reclaim EPC pages in batch of SGX_NR_TO_SCAN (16)
> pages.  Specifically, it always "scans", or "isolates" SGX_NR_TO_SCAN
> pages from the global LRU, and then tries to reclaim these pages at once
> for better performance.
> 
> Implement the "cgroup" variant EPC reclaim in a similar way, but keep
> the implementation simple: 1) change sgx_reclaim_pages() to take an LRU
> as input, and return the pages that are "scanned" and attempted for
> reclamation (but not necessarily reclaimed successfully); 2) loop the
> given EPC cgroup and its descendants and do the new sgx_reclaim_pages()
> until SGX_NR_TO_SCAN pages are "scanned".
> 
> This implementation, encapsulated in sgx_cgroup_reclaim_pages(), always
> tries to reclaim SGX_NR_TO_SCAN pages from the LRU of the given EPC
> cgroup, and only moves to its descendants when there's no enough
> reclaimable EPC pages to "scan" in its LRU.  It should be enough for
> most cases.
> 
> Note, this simple implementation doesn't _exactly_ mimic the current
> global EPC reclaim (which always tries to do the actual reclaim in batch
> of SGX_NR_TO_SCAN pages): when LRUs have less than SGX_NR_TO_SCAN
> reclaimable pages, the actual reclaim of EPC pages will be split into
> smaller batches _across_ multiple LRUs with each being smaller than
> SGX_NR_TO_SCAN pages.
> 
> A more precise way to mimic the current global EPC reclaim would be to
> have a new function to only "scan" (or "isolate") SGX_NR_TO_SCAN pages
> _across_ the given EPC cgroup _AND_ its descendants, and then do the
> actual reclaim in one batch.  But this is unnecessarily complicated at
> this stage.
> 
> Alternatively, the current sgx_reclaim_pages() could be changed to
> return the actual "reclaimed" pages, but not "scanned" pages. However,
> the reclamation is a lengthy process, forcing a successful reclamation
> of predetermined number of pages may block the caller for too long. And
> that may not be acceptable in some synchronous contexts, e.g., in
> serving an ioctl().
> 
> With this building block in place, add synchronous reclamation support
> in sgx_cgroup_try_charge(): trigger a call to
> sgx_cgroup_reclaim_pages() if the cgroup reaches its limit and the
> caller allows synchronous reclaim as indicated by s newly added
> parameter.
> 
> A later patch will add support for asynchronous reclamation reusing
> sgx_cgroup_reclaim_pages().
> 
> Note all reclaimable EPC pages are still tracked in the global LRU thus
> no per-cgroup reclamation is actually active at the moment. Per-cgroup
> tracking and reclamation will be turned on in the end after all
> necessary infrastructure is in place.

Nit:

"all necessary infrastructures are in place", or, "all necessary 
building blocks are in place".

?

> 
> Co-developed-by: Sean Christopherson <sean.j.christopherson@...el.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
> Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
> Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
> Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
> Tested-by: Jarkko Sakkinen <jarkko@...nel.org>
> ---

Reviewed-by: Kai Huang <kai.huang@...el.com>

More nitpickings below:

[...]

> -static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg)
> +static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim reclaim)

Let's still wrap the text on 80-character basis.

I guess most people are more used to that.

[...]

> -		epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable,
> -						    struct sgx_epc_page, list);
> +		epc_page = list_first_entry_or_null(&lru->reclaimable, struct sgx_epc_page, list);

Ditto.