Message-ID: <aaaa54ed-7fd7-404c-853f-90f2e32ae004@intel.com>
Date: Fri, 23 Feb 2024 11:24:47 +1300
From: "Huang, Kai" <kai.huang@...el.com>
To: Haitao Huang <haitao.huang@...ux.intel.com>, "Mehta, Sohil"
<sohil.mehta@...el.com>, "mingo@...hat.com" <mingo@...hat.com>,
"jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
<hpa@...or.com>, "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, "mkoutny@...e.com"
<mkoutny@...e.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"tj@...nel.org" <tj@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for
cgroup
On 23/02/2024 9:12 am, Haitao Huang wrote:
> On Wed, 21 Feb 2024 04:48:58 -0600, Huang, Kai <kai.huang@...el.com> wrote:
>
>> On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
>>> Hi Kai
>>> On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@...el.com>
>>> wrote:
>>> [...]
>>> >
>>> > So you introduced the work/workqueue here but there's no place which
>>> > actually
>>> > queues the work. IMHO you can either:
>>> >
>>> > 1) move relevant code change here; or
>>> > 2) focus on introducing core functions to reclaim certain pages from a
>>> > given EPC
>>> > cgroup w/o workqueue and introduce the work/workqueue in later patch.
>>> >
>>> > Makes sense?
>>> >
>>>
>>> Starting in v7, I was trying to split the big patch, #10 in v6 as you
>>> and
>>> others suggested. My thought process was to put infrastructure needed
>>> for
>>> per-cgroup reclaim in the front, then turn on per-cgroup reclaim in [v9
>>> 13/15] in the end.
>>
>> That's reasonable for sure.
>>
>
> Thanks for the confirmation :-)
>
>>>
>>> Before that, all reclaimables are tracked in the global LRU so really
>>> there is no "reclaim certain pages from a given EPC cgroup w/o
>>> workqueue"
>>> or reclaim through workqueue before that point, as suggested in #2. This
>>> patch puts down the implementation for both flows but neither used
>>> yet, as
>>> stated in the commit message.
>>
>> I know it's not used yet.  The point is how to split patches to make
>> them more self-contained and easier to review.
>
> I would think this patch is already self-contained in that it is all
> implementation of cgroup reclamation building blocks utilized later. But
> I'll try to follow your suggestions below to split further (would prefer
> not to merge in general unless there are strong reasons).
>
>>
>> For #2, sorry for not being explicit -- I meant it seems it's more
>> reasonable to
>> split in this way:
>>
>> Patch 1)
>> a). change to sgx_reclaim_pages();
>
> I'll still prefer this to be a separate patch. It is self-contained IMHO.
> We were splitting the original patch because it was too big. I don't
> want to merge back unless there is a strong reason.
>
>> b). introduce sgx_epc_cgroup_reclaim_pages();
>
> Ok.

If I got you right, I believe you want to have a cgroup variant function
following the same behaviour as the one for global reclaim, i.e., the
_current_ sgx_reclaim_pages(), which always tries to scan and reclaim
SGX_NR_TO_SCAN pages each time.

And this cgroup variant function, sgx_epc_cgroup_reclaim_pages(), tries
to scan and reclaim SGX_NR_TO_SCAN pages each time "_across_ the cgroup
and all the descendants".

And you want to implement sgx_epc_cgroup_reclaim_pages() in this way due
to WHATEVER reasons.

In that case, the change to sgx_reclaim_pages() and the introduction of
sgx_epc_cgroup_reclaim_pages() should really be together because they
are completely tied together in terms of implementation.

In this way you can just explain clearly in _ONE_ patch why you chose
this implementation, and it's also easier for the reviewer because we
can just discuss it in one patch.

Makes sense?
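
To illustrate why the two changes are coupled, here is a rough user-space
sketch of "scan at most SGX_NR_TO_SCAN pages across the cgroup and all the
descendants". It is NOT the code from the series: the toy_* names, the
fixed-size child array, the pre-order walk and the NR_TO_SCAN value of 16
are all assumptions made for illustration, and a cgroup is modelled as
nothing more than a count of reclaimable pages.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TO_SCAN   16  /* stand-in for SGX_NR_TO_SCAN; value is assumed */
#define MAX_CHILDREN 4   /* hypothetical bound, only for this toy model */

/* Toy model of an EPC cgroup: just a count of reclaimable pages. */
struct toy_epc_cg {
	unsigned int nr_reclaimable;
	struct toy_epc_cg *children[MAX_CHILDREN];
	size_t nr_children;
};

/*
 * Take up to *budget pages from @cg, then from its descendants in
 * pre-order, decrementing *budget as pages are taken.
 * Returns the number of pages taken from this subtree.
 */
static unsigned int toy_reclaim_subtree(struct toy_epc_cg *cg,
					unsigned int *budget)
{
	unsigned int taken, total;
	size_t i;

	assert(cg && budget);

	taken = cg->nr_reclaimable < *budget ? cg->nr_reclaimable : *budget;
	cg->nr_reclaimable -= taken;
	*budget -= taken;
	total = taken;

	/* Stop descending as soon as the shared budget is used up. */
	for (i = 0; i < cg->nr_children && *budget > 0; i++)
		total += toy_reclaim_subtree(cg->children[i], budget);

	return total;
}

/* One reclaim call: at most NR_TO_SCAN pages across the whole subtree. */
static unsigned int toy_epc_cgroup_reclaim_pages(struct toy_epc_cg *root)
{
	unsigned int budget = NR_TO_SCAN;

	return toy_reclaim_subtree(root, &budget);
}
```

The point of the sketch is that the per-call budget is shared by the whole
subtree, so whatever sgx_reclaim_pages() does per call and whatever the
cgroup variant does per call have to be reasoned about together.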
>
>> c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better
>> name), which just takes an EPC cgroup as input w/o involving any
>> work/workqueue.
>
> This is for the workqueue use only. So I think it'd be better to put it
> with patch #2 below?

There are multiple levels of logic here IMHO:

  1. a) and b) above focus on "each reclaim" of a given EPC cgroup
  2. c) is about a loop of the above to bring a given cgroup's usage
     down to its limit
  3. a workqueue is one (probably the best) way to do c) asynchronously
  4. the logic where 1) (direct reclaim) and 3) (indirect) are triggered

To me, it's clear 1) should be in one patch as stated above.

Also, to me 3) and 4) are better kept together since they give you a
clear view of how the direct/indirect reclaim is triggered.

2) could be flexible depending on how you see it.  If you prefer viewing
it as low-level implementation of reclaiming pages from a cgroup, then
it's also OK to be together with 1).  If you want to treat it as a part
of the _async_ way of bringing usage down to the limit, then _MAYBE_ it's
also OK to be with 3) and 4).

But to me 2) can be together with 1) or even a separate patch because
it's still kind of a low-level reclaiming detail.  3) and 4) shouldn't
contain such detail but should focus on how direct/indirect reclaim is done.
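
A rough user-space sketch of the four levels, purely to show where the
boundaries sit. None of this is taken from the series: the toy_* names,
the work_pending flag standing in for a queued work item, the can_block
parameter and the NR_TO_SCAN value of 16 are all assumptions, and the
trigger in level 4 only shows where such a decision would live, not what
the policy should be.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TO_SCAN 16  /* stand-in for SGX_NR_TO_SCAN; value is assumed */

/* Toy cgroup: page counters plus a flag standing in for queued work. */
struct toy_cg {
	unsigned int usage;        /* pages currently charged */
	unsigned int limit;        /* pages allowed */
	unsigned int reclaimable;  /* pages that can still be reclaimed */
	bool work_pending;         /* hypothetical stand-in for a work item */
};

/* Level 1: one bounded reclaim pass; returns pages reclaimed. */
static unsigned int toy_reclaim_once(struct toy_cg *cg)
{
	unsigned int n = cg->reclaimable < NR_TO_SCAN ?
			 cg->reclaimable : NR_TO_SCAN;

	assert(cg->usage >= n);
	cg->reclaimable -= n;
	cg->usage -= n;
	return n;
}

/* Level 2: loop level 1 until under the limit or no progress is made. */
static bool toy_reclaim_to_limit(struct toy_cg *cg)
{
	while (cg->usage > cg->limit) {
		if (!toy_reclaim_once(cg))
			break;
	}
	return cg->usage <= cg->limit;
}

/* Level 3: async wrapper; "queueing" only marks the work as pending. */
static void toy_queue_reclaim(struct toy_cg *cg)
{
	cg->work_pending = true;
}

/* Runs the pending work, as a worker thread would do later. */
static void toy_run_pending(struct toy_cg *cg)
{
	if (!cg->work_pending)
		return;
	cg->work_pending = false;
	toy_reclaim_to_limit(cg);
}

/* Level 4: where direct (level 1) vs indirect (level 3) is chosen. */
static void toy_trigger_reclaim(struct toy_cg *cg, bool can_block)
{
	if (can_block)
		toy_reclaim_once(cg);   /* direct reclaim in caller context */
	else
		toy_queue_reclaim(cg);  /* defer to the async path */
}
```

In this shape, levels 1 and 2 know nothing about work items, and levels 3
and 4 know nothing about how pages are picked, which is the separation
argued for above.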
[...]
>
> To be honest, the part I find most confusing is this
> self-contained-ness. It seems to depend on how you look at things.
Completely understand. But I think our discussion should be helpful to
both of us and others.