Message-ID: <op.2lbh28nuwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Wed, 27 Mar 2024 19:24:34 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "Mehta, Sohil" <sohil.mehta@...el.com>, "mingo@...hat.com"
<mingo@...hat.com>, "jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org"
<x86@...nel.org>, "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
<hpa@...or.com>, "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, "mkoutny@...e.com"
<mkoutny@...e.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "tj@...nel.org"
<tj@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>, "Huang, Kai"
<kai.huang@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for
cgroup
On Thu, 22 Feb 2024 16:24:47 -0600, Huang, Kai <kai.huang@...el.com> wrote:
>
>
> On 23/02/2024 9:12 am, Haitao Huang wrote:
>> On Wed, 21 Feb 2024 04:48:58 -0600, Huang, Kai <kai.huang@...el.com>
>> wrote:
>>
>>> On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
>>>> Hi Kai
>>>> On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@...el.com>
>>>> wrote:
>>>> [...]
>>>> >
>>>> > So you introduced the work/workqueue here but there's no place which
>>>> > actually
>>>> > queues the work. IMHO you can either:
>>>> >
>>>> > 1) move relevant code change here; or
>>>> > 2) focus on introducing core functions to reclaim certain pages
>>>> from a
>>>> > given EPC
>>>> > cgroup w/o workqueue and introduce the work/workqueue in later
>>>> patch.
>>>> >
>>>> > Makes sense?
>>>> >
>>>>
>>>> Starting in v7, I was trying to split the big patch, #10 in v6 as you
>>>> and
>>>> others suggested. My thought process was to put infrastructure needed
>>>> for
>>>> per-cgroup reclaim in the front, then turn on per-cgroup reclaim in
>>>> [v9
>>>> 13/15] in the end.
>>>
>>> That's reasonable for sure.
>>>
>> Thanks for the confirmation :-)
>>
>>>>
>>>> Before that, all reclaimables are tracked in the global LRU so really
>>>> there is no "reclaim certain pages from a given EPC cgroup w/o
>>>> workqueue"
>>>> or reclaim through workqueue before that point, as suggested in #2.
>>>> This
>>>> patch puts down the implementation for both flows but neither used
>>>> yet, as
>>>> stated in the commit message.
>>>
>>> I know it's not used yet. The point is how to split patches to make
>>> them more
>>> self-contain and easy to review.
>> I would think this patch already self-contained in that all are
>> implementation of cgroup reclamation building blocks utilized later.
>> But I'll try to follow your suggestions below to split further (would
>> prefer not to merge in general unless there is strong reasons).
>>
>>>
>>> For #2, sorry for not being explicit -- I meant it seems it's more
>>> reasonable to
>>> split in this way:
>>>
>>> Patch 1)
>>> a). change to sgx_reclaim_pages();
>> I'll still prefer this to be a separate patch. It is self-contained
>> IMHO.
>> We were splitting the original patch because it was too big. I don't
>> want to merge back unless there is a strong reason.
>>
>>> b). introduce sgx_epc_cgroup_reclaim_pages();
>> Ok.
>
> If I got you right, I believe you want to have a cgroup variant function
> following the same behaviour of the one for global reclaim, i.e., the
> _current_ sgx_reclaim_pages(), which always tries to scan and reclaim
> SGX_NR_TO_SCAN pages each time.
>
> And this cgroup variant function, sgx_epc_cgroup_reclaim_pages(), tries
> to scan and reclaim SGX_NR_TO_SCAN pages each time "_across_ the cgroup
> and all the descendants".
>
> And you want to implement sgx_epc_cgroup_reclaim_pages() in this way due
> to WHATEVER reasons.
>
> In that case, the change to sgx_reclaim_pages() and the introduce of
> sgx_epc_cgroup_reclaim_pages() should really be together because they
> are completely tied together in terms of implementation.
>
> In this way you can just explain clearly in _ONE_ patch why you choose
> this implementation, and for reviewer it's also easier to review because
> we can just discuss in one patch.
>
> Makes sense?
>
>>
>>> c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better
>>> name), which just takes an EPC cgroup as input w/o involving any
>>> work/workqueue.
>> This is for the workqueue use only. So I think it'd be better be with
>> patch #2 below?
>
> There are multiple levels of logic here IMHO:
>
> 1. a) and b) above focus on "each reclaim" a given EPC cgroup
> 2. c) is about a loop of above to bring given cgroup's usage to limit
> 3. workqueue is one (probably best) way to do c) in async way
> 4. the logic where 1) (direct reclaim) and 3) (indirect) are triggered
>
> To me, it's clear 1) should be in one patch as stated above.
>
> Also, to me 3) and 4) are better to be together since they give you a
> clear view on how the direct/indirect reclaim are triggered.
>
> 2) could be flexible depending on how you see it. If you prefer viewing
> it from low-level implementation of reclaiming pages from cgroup, then
> it's also OK to be together with 1). If you want to treat it as a part
> of _async_ way of bring down usage to limit, then _MAYBE_ it's also OK
> to be with 3) and 4).
>
> But to me 2) can be together with 1) or even a separate patch because
> it's still kinda of low-level reclaiming details. 3) and 4) shouldn't
> contain such detail but should focus on how direct/indirect reclaim is
> done.
I incorporated most of your suggestions, and I think it would be better to
discuss this with the actual code.
So I'm sending out v10 and will just quickly summarize here what I did to
address this particular issue.
I pretty much followed the suggestions above and ended up with two patches:
1) a) and b) above plus direct reclaim triggered in try_charge(), so
reviewers can see at least one use of sgx_cgroup_reclaim_pages(),
which is the basic building block.
2) Everything async-related: c) above, the workqueue, and indirect reclaim
triggered in try_charge(), which queues the work.
Please review v10, and if you think the triggering parts need to be
separated, then I'll separate them.
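To make the two-patch split easier to picture, here is a rough user-space
model of it. This is only a sketch and not the v10 code: struct model_cgroup,
the model_* function names, the value of MODEL_NR_TO_SCAN and the can_block
switch between the direct and the queued path are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for SGX_NR_TO_SCAN; the value 16 is assumed for this model. */
#define MODEL_NR_TO_SCAN 16UL

/* Hypothetical, simplified view of one EPC cgroup (not the kernel struct). */
struct model_cgroup {
	unsigned long usage;       /* pages currently charged */
	unsigned long limit;       /* maximum pages allowed */
	unsigned long reclaimable; /* charged pages that can be reclaimed */
	bool work_queued;          /* models a pending async work item */
};

/* Patch 1 building block: one bounded reclaim pass over the cgroup. */
static unsigned long model_cgroup_reclaim_pages(struct model_cgroup *cg)
{
	unsigned long n = cg->reclaimable < MODEL_NR_TO_SCAN ?
			  cg->reclaimable : MODEL_NR_TO_SCAN;

	assert(cg->reclaimable <= cg->usage);
	cg->reclaimable -= n;
	cg->usage -= n;
	return n;
}

/* c) in patch 2: repeat the bounded pass until usage is below the limit. */
static void model_cgroup_reclaim_work(struct model_cgroup *cg)
{
	cg->work_queued = false;
	while (cg->usage >= cg->limit && model_cgroup_reclaim_pages(cg) > 0)
		;
}

/*
 * Charge one page. Returns 0 on success, -1 on failure.
 * Choosing the path with can_block is an assumption of this model.
 */
static int model_try_charge(struct model_cgroup *cg, bool can_block)
{
	while (cg->usage >= cg->limit) {
		if (!can_block) {
			cg->work_queued = true; /* patch 2: defer to worker */
			return -1;
		}
		if (model_cgroup_reclaim_pages(cg) == 0) /* patch 1: direct */
			return -1;
	}
	cg->usage++;
	return 0;
}
```

In this model a blocking caller pays for one or more bounded passes itself,
while a non-blocking caller only marks the work as queued and relies on
model_cgroup_reclaim_work() running later to make room.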
Additionally, after more experimentation, I simplified sgx_reclaim_pages()
by removing the *nr_to_scan pointer as you suggested, but it returns the
number of pages collected for isolation (attempted for reclaim) instead of
the number of pages actually reclaimed. I found the performance acceptable
with this approach.
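The difference between "collected for isolation" and "actually reclaimed"
can be shown with a small user-space model. Again this is only a sketch under
assumptions, not the kernel function: struct model_page, its young flag, the
model_* names and the scan bound are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for SGX_NR_TO_SCAN; the value 16 is assumed for this model. */
#define MODEL_NR_TO_SCAN 16U

/* Hypothetical page state; the real reclaimer tracks far more than this. */
struct model_page {
	bool young;     /* recently accessed; this model skips it once */
	bool reclaimed; /* already reclaimed by an earlier pass */
};

/*
 * One bounded pass. Returns the number of pages isolated (attempted),
 * which can be larger than the number actually reclaimed.
 */
static unsigned int model_reclaim_pages(struct model_page *lru, size_t len)
{
	unsigned int isolated = 0;

	for (size_t i = 0; i < len && isolated < MODEL_NR_TO_SCAN; i++) {
		if (lru[i].reclaimed)
			continue;
		isolated++;                   /* counted as attempted */
		if (lru[i].young) {
			lru[i].young = false; /* aged, left for a later pass */
			continue;
		}
		lru[i].reclaimed = true;
	}
	return isolated;
}

/* Helper for inspection only: how many pages were really reclaimed. */
static unsigned int model_count_reclaimed(const struct model_page *lru,
					  size_t len)
{
	unsigned int n = 0;

	for (size_t i = 0; i < len; i++)
		if (lru[i].reclaimed)
			n++;
	return n;
}
```

With this return value a caller needs no out-parameter: a return of zero
means there was nothing left to attempt, which is enough to stop a loop.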
Thanks again for your review.
Haitao