Message-ID: <op.2jd9vpf0wjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Mon, 19 Feb 2024 09:12:51 -0600
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: dave.hansen@...ux.intel.com, tj@...nel.org, mkoutny@...e.com,
linux-kernel@...r.kernel.org, linux-sgx@...r.kernel.org, x86@...nel.org,
cgroups@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
hpa@...or.com, sohil.mehta@...el.com, tim.c.chen@...ux.intel.com, "Jarkko
Sakkinen" <jarkko@...nel.org>
Cc: zhiquan1.li@...el.com, kristen@...ux.intel.com, seanjc@...gle.com,
zhanb@...rosoft.com, anakrish@...rosoft.com, mikko.ylinen@...ux.intel.com,
yangjie@...rosoft.com, chrisyan@...rosoft.com
Subject: Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup
try_charge()
On Tue, 13 Feb 2024 19:52:25 -0600, Jarkko Sakkinen <jarkko@...nel.org>
wrote:
> On Tue Feb 13, 2024 at 1:15 AM EET, Haitao Huang wrote:
>> Hi Jarkko
>>
>> On Mon, 12 Feb 2024 13:55:46 -0600, Jarkko Sakkinen <jarkko@...nel.org>
>> wrote:
>>
>> > On Mon Feb 5, 2024 at 11:06 PM EET, Haitao Huang wrote:
>> >> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> >>
>> >> When the EPC usage of a cgroup is near its limit, the cgroup needs to
>> >> reclaim pages used in the same cgroup to make room for new
>> allocations.
>> >> This is analogous to the behavior that the global reclaimer is
>> triggered
>> >> when the global usage is close to total available EPC.
>> >>
>> >> Add a Boolean parameter for sgx_epc_cgroup_try_charge() to indicate
>> >> whether synchronous reclaim is allowed or not. And trigger the
>> >> synchronous/asynchronous reclamation flow accordingly.
>> >>
>> >> Note at this point, all reclaimable EPC pages are still tracked in
>> the
>> >> global LRU and per-cgroup LRUs are empty. So no per-cgroup
>> reclamation
>> >> is activated yet.
>> >>
>> >> Co-developed-by: Sean Christopherson
>> <sean.j.christopherson@...el.com>
>> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> >> Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> >> Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> >> Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> >> ---
>> >> V7:
>> >> - Split this out from the big patch, #10 in V6. (Dave, Kai)
>> >> ---
>> >> arch/x86/kernel/cpu/sgx/epc_cgroup.c | 26 ++++++++++++++++++++++++--
>> >> arch/x86/kernel/cpu/sgx/epc_cgroup.h | 4 ++--
>> >> arch/x86/kernel/cpu/sgx/main.c | 2 +-
>> >> 3 files changed, 27 insertions(+), 5 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> >> b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> >> index d399fda2b55e..abf74fdb12b4 100644
>> >> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> >> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> >> @@ -184,13 +184,35 @@ static void
>> >> sgx_epc_cgroup_reclaim_work_func(struct work_struct *work)
>> >> /**
>> >> * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single
>> EPC
>> >> page
>> >> * @epc_cg: The EPC cgroup to be charged for the page.
>> >> + * @reclaim: Whether or not synchronous reclaim is allowed
>> >> * Return:
>> >> * * %0 - If successfully charged.
>> >> * * -errno - for failures.
>> >> */
>> >> -int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
>> >> +int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, bool
>> >> reclaim)
>> >> {
>> >> - return misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
>> PAGE_SIZE);
>> >> + for (;;) {
>> >> + if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
>> >> + PAGE_SIZE))
>> >> + break;
>> >> +
>> >> + if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
>> >> + return -ENOMEM;
>> >> +
>> >> + if (signal_pending(current))
>> >> + return -ERESTARTSYS;
>> >> +
>> >> + if (!reclaim) {
>> >> + queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);
>> >> + return -EBUSY;
>> >> + }
>> >> +
>> >> + if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, false))
>> >> + /* All pages were too young to reclaim, try again a little later
>> */
>> >> + schedule();
>> >
>> > This will be total pain to backtrack after a while when something
>> > needs to be changed so there definitely should be inline comments
>> > addressing each branch condition.
>> >
>> > I'd rethink this as:
>> >
>> > 1. Create static __sgx_epc_cgroup_try_charge() for addressing single
>> > iteration with the new "reclaim" parameter.
>> > 2. Add a new sgx_epc_cgroup_try_charge_reclaim() function.
>> >
>> > There's a bit of redundancy with sgx_epc_cgroup_try_charge() and
>> > sgx_epc_cgroup_try_charge_reclaim() because both have almost the
>> > same loop calling internal __sgx_epc_cgroup_try_charge() with
>> > different parameters. That is totally acceptable.
>> >
>> > Please also add my suggested-by.
>> >
>> > BR, Jarkko
>> >
>> For #2:
>> The only caller of this function, sgx_alloc_epc_page(), has the same
>> boolean which is passed into this function.
>
> I know. This would be a good opportunity to fix that up. Large patch
> sets should try to make the space for their feature the best possible
> and thus also clean up the code base overall.
>
>> If we separate it into sgx_epc_cgroup_try_charge() and
>> sgx_epc_cgroup_try_charge_reclaim(), then the caller has to have the
>> if/else branches. So the separation here does not seem to help?
>
> Of course it does. It makes the code in that location self-documenting
> and easier to remember what it does.
>
> BR, Jarkko
>
Please let me know if this aligns with your suggestion.

static int ___sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
{
	if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
				PAGE_SIZE))
		return 0;

	if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
		return -ENOMEM;

	if (signal_pending(current))
		return -ERESTARTSYS;

	return -EBUSY;
}

/**
 * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single page
 * @epc_cg:	The EPC cgroup to be charged for the page.
 *
 * Try to reclaim pages in the background if the group reaches its limit and
 * there are reclaimable pages in the group.
 * Return:
 * * %0 - If successfully charged.
 * * -errno - for failures.
 */
int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
{
	int ret = ___sgx_epc_cgroup_try_charge(epc_cg);

	if (ret == -EBUSY)
		queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);

	return ret;
}

/**
 * sgx_epc_cgroup_try_charge_reclaim() - try to charge cgroup for a single page
 * @epc_cg:	The EPC cgroup to be charged for the page.
 *
 * Try to reclaim pages directly if the group reaches its limit and there are
 * reclaimable pages in the group.
 * Return:
 * * %0 - If successfully charged.
 * * -errno - for failures.
 */
int sgx_epc_cgroup_try_charge_reclaim(struct sgx_epc_cgroup *epc_cg)
{
	int ret;

	for (;;) {
		ret = ___sgx_epc_cgroup_try_charge(epc_cg);
		if (ret != -EBUSY)
			return ret;

		if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, current->mm))
			/* All pages were too young to reclaim, try again a little later */
			schedule();
	}

	return 0;
}

It is a little more involved to remove the boolean from
sgx_alloc_epc_page() and its callers, such as sgx_encl_grow() and
sgx_alloc_va_page(). I'll send a separate patch for comments.
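
To make the caller-side concern concrete, here is a rough user-space sketch of the split, including the if/else a caller needs while it still takes the boolean. This is illustration only, not kernel code: struct mock_cg and all mock_* functions are hypothetical stand-ins, "reclaim" is modeled as freeing exactly one page per iteration, and the signal_pending() branch is left out.

```c
/* User-space sketch only: every mock_* name is hypothetical, not a kernel API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct mock_cg {
	int usage;		/* pages currently charged */
	int limit;		/* maximum pages allowed */
	int reclaimable;	/* pages a reclaimer could free */
	int async_requests;	/* stands in for queue_work() calls */
};

/* Single charge attempt, mirroring the shape of the static helper. */
static int mock_try_charge_once(struct mock_cg *cg)
{
	if (cg->usage < cg->limit) {
		cg->usage++;
		return 0;
	}
	/* Over the limit and nothing to reclaim: give up. */
	if (!cg->reclaimable)
		return -ENOMEM;
	/* Over the limit, but reclaim could make room. */
	return -EBUSY;
}

/* Non-blocking variant: request background reclaim and report -EBUSY. */
static int mock_try_charge(struct mock_cg *cg)
{
	int ret = mock_try_charge_once(cg);

	if (ret == -EBUSY)
		cg->async_requests++;
	return ret;
}

/* Blocking variant: reclaim directly until the charge succeeds or fails hard. */
static int mock_try_charge_reclaim(struct mock_cg *cg)
{
	int ret;

	for (;;) {
		ret = mock_try_charge_once(cg);
		if (ret != -EBUSY)
			return ret;
		/* Assumption: each direct reclaim pass frees exactly one page. */
		cg->reclaimable--;
		cg->usage--;
		assert(cg->usage >= 0);
	}
}

/* A caller that still takes the boolean has to branch on it. */
static int mock_alloc_page(struct mock_cg *cg, bool reclaim)
{
	if (reclaim)
		return mock_try_charge_reclaim(cg);
	return mock_try_charge(cg);
}
```

Once the boolean is removed from the callers, each call site would pick one of the two wrappers directly and the branch in mock_alloc_page() would go away.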
Thanks
Haitao