linux-kernel - Re: [PATCH v12 08/14] x86/sgx: Add basic EPC reclamation flow for cgroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <op.2moue7exwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Tue, 23 Apr 2024 10:53:21 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: jarkko@...nel.org, dave.hansen@...ux.intel.com, tj@...nel.org,
 mkoutny@...e.com, linux-kernel@...r.kernel.org, linux-sgx@...r.kernel.org,
 x86@...nel.org, cgroups@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
 bp@...en8.de, hpa@...or.com, sohil.mehta@...el.com,
 tim.c.chen@...ux.intel.com, "Huang, Kai" <kai.huang@...el.com>
Cc: zhiquan1.li@...el.com, kristen@...ux.intel.com, seanjc@...gle.com,
 zhanb@...rosoft.com, anakrish@...rosoft.com, mikko.ylinen@...ux.intel.com,
 yangjie@...rosoft.com, chrisyan@...rosoft.com
Subject: Re: [PATCH v12 08/14] x86/sgx: Add basic EPC reclamation flow for
 cgroup

On Wed, 17 Apr 2024 18:51:28 -0500, Huang, Kai <kai.huang@...el.com> wrote:

>
>
> On 16/04/2024 3:20 pm, Haitao Huang wrote:
>> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
>>  Currently in the EPC page allocation, the kernel simply fails the
>> allocation when the current EPC cgroup fails to charge due to its usage
>> reaching limit.  This is not ideal. When that happens, a better way is
>> to reclaim EPC page(s) from the current EPC cgroup (and/or its
>> descendants) to reduce its usage so the new allocation can succeed.
>>  Add the basic building blocks to support per-cgroup reclamation.
>>  Currently the kernel only has one place to reclaim EPC pages: the  
>> global
>> EPC LRU list.  To support the "per-cgroup" EPC reclaim, maintain an LRU
>> list for each EPC cgroup, and introduce a "cgroup" variant function to
>> reclaim EPC pages from a given EPC cgroup and its descendants.
>>  Currently the kernel does the global EPC reclaim in sgx_reclaim_page().
>> It always tries to reclaim EPC pages in batch of SGX_NR_TO_SCAN (16)
>> pages.  Specifically, it always "scans", or "isolates" SGX_NR_TO_SCAN
>> pages from the global LRU, and then tries to reclaim these pages at once
>> for better performance.
>>  Implement the "cgroup" variant EPC reclaim in a similar way, but keep
>> the implementation simple: 1) change sgx_reclaim_pages() to take an LRU
>> as input, and return the pages that are "scanned" and attempted for
>> reclamation (but not necessarily reclaimed successfully); 2) loop the
>> given EPC cgroup and its descendants and do the new sgx_reclaim_pages()
>> until SGX_NR_TO_SCAN pages are "scanned".
>>  This implementation, encapsulated in sgx_cgroup_reclaim_pages(), always
>> tries to reclaim SGX_NR_TO_SCAN pages from the LRU of the given EPC
>> cgroup, and only moves to its descendants when there's no enough
>> reclaimable EPC pages to "scan" in its LRU.  It should be enough for
>> most cases.
>>  Note, this simple implementation doesn't _exactly_ mimic the current
>> global EPC reclaim (which always tries to do the actual reclaim in batch
>> of SGX_NR_TO_SCAN pages): when LRUs have less than SGX_NR_TO_SCAN
>> reclaimable pages, the actual reclaim of EPC pages will be split into
>> smaller batches _across_ multiple LRUs with each being smaller than
>> SGX_NR_TO_SCAN pages.
>>  A more precise way to mimic the current global EPC reclaim would be to
>> have a new function to only "scan" (or "isolate") SGX_NR_TO_SCAN pages
>> _across_ the given EPC cgroup _AND_ its descendants, and then do the
>> actual reclaim in one batch.  But this is unnecessarily complicated at
>> this stage.
>>  Alternatively, the current sgx_reclaim_pages() could be changed to
>> return the actual "reclaimed" pages, but not "scanned" pages. However,
>> the reclamation is a lengthy process, forcing a successful reclamation
>> of predetermined number of pages may block the caller for too long. And
>> that may not be acceptable in some synchronous contexts, e.g., in
>> serving an ioctl().
>>  With this building block in place, add synchronous reclamation support
>> in sgx_cgroup_try_charge(): trigger a call to
>> sgx_cgroup_reclaim_pages() if the cgroup reaches its limit and the
>> caller allows synchronous reclaim as indicated by s newly added
>> parameter.
>>  A later patch will add support for asynchronous reclamation reusing
>> sgx_cgroup_reclaim_pages().
>>  Note all reclaimable EPC pages are still tracked in the global LRU thus
>> no per-cgroup reclamation is actually active at the moment. Per-cgroup
>> tracking and reclamation will be turned on in the end after all
>> necessary infrastructure is in place.
>
> Nit:
>
> "all necessary infrastructures are in place", or, "all necessary  
> building blocks are in place".
>
> ?
>
>>  Co-developed-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> Tested-by: Jarkko Sakkinen <jarkko@...nel.org>
>> ---
>
> Reviewed-by: Kai Huang <kai.huang@...el.com>
>

Thanks

> More nitpickings below:
>
> [...]
>
>> -static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg)
>> +static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg,  
>> enum sgx_reclaim reclaim)
>
> Let's still wrap the text on 80-character basis.
>
> I guess most people are more used to that.
>
> [...]
>
>> -		epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable,
>> -						    struct sgx_epc_page, list);
>> +		epc_page = list_first_entry_or_null(&lru->reclaimable, struct  
>> sgx_epc_page, list);
>
> Ditto.
>

Actually I changed to 100 char width based on comments from Jarkko IIRC.
I don't have personal preference, but will not change back to 80 unless  
Jarkko also agrees.

Thanks
Haitao