linux-kernel - Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for cgroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <op.2lbh28nuwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Wed, 27 Mar 2024 19:24:34 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "Mehta, Sohil" <sohil.mehta@...el.com>, "mingo@...hat.com"
 <mingo@...hat.com>, "jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org"
 <x86@...nel.org>, "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
 "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
 <hpa@...or.com>, "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
 "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, "mkoutny@...e.com"
 <mkoutny@...e.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "tj@...nel.org"
 <tj@...nel.org>, "linux-kernel@...r.kernel.org"
 <linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>, "Huang, Kai"
 <kai.huang@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
 "seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
 <anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
 "kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
 <yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
 "chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for
 cgroup

On Thu, 22 Feb 2024 16:24:47 -0600, Huang, Kai <kai.huang@...el.com> wrote:

>
>
> On 23/02/2024 9:12 am, Haitao Huang wrote:
>> On Wed, 21 Feb 2024 04:48:58 -0600, Huang, Kai <kai.huang@...el.com>  
>> wrote:
>>
>>> On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
>>>> StartHi Kai
>>>> On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@...el.com>  
>>>> wrote:
>>>> [...]
>>>> >
>>>> > So you introduced the work/workqueue here but there's no place which
>>>> > actually
>>>> > queues the work.  IMHO you can either:
>>>> >
>>>> > 1) move relevant code change here; or
>>>> > 2) focus on introducing core functions to reclaim certain pages  
>>>> from a
>>>> > given EPC
>>>> > cgroup w/o workqueue and introduce the work/workqueue in later  
>>>> patch.
>>>> >
>>>> > Makes sense?
>>>> >
>>>>
>>>> Starting in v7, I was trying to split the big patch, #10 in v6 as you  
>>>> and
>>>> others suggested. My thought process was to put infrastructure needed  
>>>> for
>>>> per-cgroup reclaim in the front, then turn on per-cgroup reclaim in  
>>>> [v9
>>>> 13/15] in the end.
>>>
>>> That's reasonable for sure.
>>>
>>  Thanks for the confirmation :-)
>>
>>>>
>>>> Before that, all reclaimables are tracked in the global LRU so really
>>>> there is no "reclaim certain pages from a  given EPC cgroup w/o  
>>>> workqueue"
>>>> or reclaim through workqueue before that point, as suggested in #2.  
>>>> This
>>>> patch puts down the implementation for both flows but neither used  
>>>> yet, as
>>>> stated in the commit message.
>>>
>>> I know it's not used yet.  The point is how to split patches to make  
>>> them more
>>> self-contain and easy to review.
>>  I would think this patch already self-contained in that all are  
>> implementation of cgroup reclamation building blocks utilized later.  
>> But I'll try to follow your suggestions below to split further (would  
>> prefer not to merge in general unless there is strong reasons).
>>
>>>
>>> For #2, sorry for not being explicit -- I meant it seems it's more  
>>> reasonable to
>>> split in this way:
>>>
>>> Patch 1)
>>>   a). change to sgx_reclaim_pages();
>>  I'll still prefer this to be a separate patch. It is self-contained  
>> IMHO.
>> We were splitting the original patch because it was too big. I don't  
>> want to merge back unless there is a strong reason.
>>
>>>   b). introduce sgx_epc_cgroup_reclaim_pages();
>>  Ok.
>
> If I got you right, I believe you want to have a cgroup variant function  
> following the same behaviour of the one for global reclaim, i.e., the  
> _current_ sgx_reclaim_pages(), which always tries to scan and reclaim  
> SGX_NR_TO_SCAN pages each time.
>
> And this cgroup variant function, sgx_epc_cgroup_reclaim_pages(), tries  
> to scan and reclaim SGX_NR_TO_SCAN pages each time "_across_ the cgroup  
> and all the descendants".
>
> And you want to implement sgx_epc_cgroup_reclaim_pages() in this way due  
> to WHATEVER reasons.
>
> In that case, the change to sgx_reclaim_pages() and the introduce of  
> sgx_epc_cgroup_reclaim_pages() should really be together because they  
> are completely tied together in terms of implementation.
>
> In this way you can just explain clearly in _ONE_ patch why you choose  
> this implementation, and for reviewer it's also easier to review because  
> we can just discuss in one patch.
>
> Makes sense?
>
>>
>>>   c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better  
>>> name),     which just takes an EPC cgroup as input w/o involving any  
>>> work/workqueue.
>>  This is for the workqueue use only. So I think it'd be better be with  
>> patch #2 below?
>
> There are multiple levels of logic here IMHO:
>
>    1. a) and b) above focus on "each reclaim" a given EPC cgroup
>    2. c) is about a loop of above to bring given cgroup's usage to limit
>    3. workqueue is one (probably best) way to do c) in async way
>    4. the logic where 1) (direct reclaim) and 3) (indirect) are triggered
>
> To me, it's clear 1) should be in one patch as stated above.
>
> Also, to me 3) and 4) are better to be together since they give you a  
> clear view on how the direct/indirect reclaim are triggered.
>
> 2) could be flexible depending on how you see it.  If you prefer viewing  
> it from low-level implementation of reclaiming pages from cgroup, then  
> it's also OK to be together with 1).  If you want to treat it as a part  
> of _async_ way of bring down usage to limit, then _MAYBE_ it's also OK  
> to be with 3) and 4).
>
> But to me 2) can be together with 1) or even a separate patch because  
> it's still kinda of low-level reclaiming details.  3) and 4) shouldn't  
> contain such detail but should focus on how direct/indirect reclaim is  
> done.

I incorporated most of your suggestions, and think it'd be better discuss  
this with actual code.

So I'm sending out v10, and just quickly summarize what I did to address  
this particular issue here.

I pretty much follow above suggestions and end up with two patches:

1) a) and b) above  plus direct reclaim triggered in try_charge() so  
reviewers can see at lease one use of the sgx_cgroup_reclaim_pages(),  
which is the basic building block.

2) All async related: c) above, workqueue, indirect triggered in  
try_charge() which queues the work.

Please review v10 and if you think the triggering parts need be separated  
then I'll separate.

Additionally, after more experimentation, I simplified sgx_reclaim_pages()  
by removing the pointer for *nr_to_scan as you suggested, but returning  
pages collected for isolation (attempted for reclaim) instead of pages  
actually reclaimed. I found performance is acceptable with this approach.

Thanks again for your review.
Haitao