Message-ID: <103f18636f0d65e3bcb0ca5f1008c0c7df0bdfd7.camel@intel.com>
Date: Thu, 20 Jun 2024 10:30:16 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "chenridong@...wei.com" <chenridong@...wei.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "mkoutny@...e.com"
<mkoutny@...e.com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "haitao.huang@...ux.intel.com"
<haitao.huang@...ux.intel.com>, "tim.c.chen@...ux.intel.com"
<tim.c.chen@...ux.intel.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "tj@...nel.org" <tj@...nel.org>,
"jarkko@...nel.org" <jarkko@...nel.org>, "Mehta, Sohil"
<sohil.mehta@...el.com>, "hpa@...or.com" <hpa@...or.com>, "bp@...en8.de"
<bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v15 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
On 18/06/2024 12:53 am, Huang, Haitao wrote:
> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
>
> Previous patches have implemented all infrastructure needed for
> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC
> pages are still tracked in the global LRU, as sgx_epc_page_lru() always
> returns a reference to the global LRU.
>
> Change sgx_epc_page_lru() to return the LRU of the cgroup in which the
> given EPC page is allocated.
>
> This makes all EPC pages tracked in per-cgroup LRUs, and the global
> reclaimer (ksgxd) will no longer be able to reclaim any pages from the
> global LRU. However, in cases of over-committing, i.e., when the sum of
> cgroup limits is greater than the total capacity, cgroups may never
> reclaim on their own, yet the total usage can still be near the capacity.
> Therefore, a global reclamation is still needed in those cases and it
> should be performed from the root cgroup.
>
> Modify sgx_reclaim_pages_global() to reclaim from the root EPC cgroup
> when the cgroup is enabled. Similar to sgx_cgroup_reclaim_pages(), return
> the next cgroup so callers can use it as the new starting node for the
> next round of reclamation if needed.
>
> Also update sgx_can_reclaim_global() to check the emptiness of the LRUs
> of all cgroups when the EPC cgroup is enabled, and otherwise only check
> the global LRU.
>
> Finally, change sgx_reclaim_direct() to check and ensure there are free
> pages at the cgroup level so forward progress can be made by the caller.
Reading the above, it's not clear how the _new_ global reclaim works with
multiple LRUs.
E.g., the current global reclaim essentially treats all EPC pages equally
when scanning those pages. From the above, I don't see how this is
achieved in the new global reclaim (a rough sketch of my reading follows
the two points below).
The changelog should:
1) Describe how the existing global reclaim works, and then describe how
the same behaviour is achieved in the new global reclaim, which works
with multiple LRUs;
2) If there's any behaviour difference between the "existing" and the
"new" global reclaim, point out the difference and explain why such a
difference is OK.
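To make the comparison concrete, IIUC the two behaviours boil down to
something like below (only a sketch of my understanding, not the exact
code in this series):

        /*
         * Existing: one global LRU, so every reclaimable EPC page gets
         * roughly the same scan pressure:
         */
        sgx_reclaim_pages(&sgx_global_lru, charge_mm);

        /*
         * New: walk the cgroup hierarchy from the root and reclaim the
         * per-cgroup LRUs, continuing from @next_cg on the next round:
         */
        next_cg = sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);

The changelog should explain whether (and how) the second form still gives
pages in different cgroups a comparable chance to be reclaimed.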
>
> With these changes, the global reclamation and per-cgroup reclamation
> both work properly with all pages tracked in per-cgroup LRUs.
>
[...]
>
> -static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
> +static struct misc_cg *sgx_reclaim_pages_global(struct misc_cg *next_cg,
> + struct mm_struct *charge_mm)
> {
> + if (IS_ENABLED(CONFIG_CGROUP_MISC))
> + return sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);
> +
> sgx_reclaim_pages(&sgx_global_lru, charge_mm);
> + return NULL;
> }
>
> /*
> @@ -414,12 +443,35 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
> */
> void sgx_reclaim_direct(void)
> {
> + struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
> + struct misc_cg *cg = misc_from_sgx(sgx_cg);
From below, @sgx_cg could be NULL, so it's not immediately clear whether
calling misc_from_sgx(sgx_cg) unconditionally is safe here.
Leave the initialization of @cg to a later point where @sgx_cg is
guaranteed not to be NULL, or initialize @cg to NULL here and update it
later.
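E.g., something like below (just to illustrate what I mean, untested):

        struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
        struct misc_cg *cg = NULL;
        struct misc_cg *next_cg = NULL;

        if (sgx_cg) {
                cg = misc_from_sgx(sgx_cg);
                if (sgx_cgroup_should_reclaim(sgx_cg))
                        next_cg = sgx_cgroup_reclaim_pages(cg, NULL,
                                                           current->mm);
        }

so misc_from_sgx() is only ever called with a non-NULL @sgx_cg.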
> + struct misc_cg *next_cg = NULL;
> +
> + /*
> + * Make sure there are some free pages at both cgroup and global levels.
> + * In both cases, only make one attempt of reclamation to avoid lengthy
> + * block on the caller.
> + */
> + if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg))
> + next_cg = sgx_cgroup_reclaim_pages(cg, next_cg, current->mm);
I don't quite follow the logic.
First of all, sgx_cgroup_reclaim_pages() isn't called in a loop, so why
not just do:
next_cg = sgx_cgroup_reclaim_pages(cg, NULL, current->mm);
And what is the point of setting @next_cg here, since ...
> +
> + if (next_cg != cg)
> + put_misc_cg(next_cg);
> +
> + next_cg = NULL;
... here @next_cg is reset to NULL?
It looks like the only reason is that you need to do ...
put_misc_cg(next_cg);
... above?
This piece of code appears repeatedly in this file. Is there any way we
can get rid of it? (a rough helper sketch is below)
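E.g., perhaps a small helper (name and exact form just for illustration):

        /*
         * Drop the reference unless @next_cg is the starting cgroup
         * itself, and reset it for the next round of reclamation:
         */
        static void sgx_put_next_cg(struct misc_cg **next_cg,
                                    struct misc_cg *start)
        {
                if (*next_cg != start)
                        put_misc_cg(*next_cg);

                *next_cg = NULL;
        }

so each caller shrinks to a single sgx_put_next_cg(&next_cg, cg) (or
misc_cg_root() for the global paths). Not insisting on this particular
form, just wondering whether the duplication can be avoided.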
> if (sgx_should_reclaim_global(SGX_NR_LOW_PAGES))
> - sgx_reclaim_pages_global(current->mm);
> + next_cg = sgx_reclaim_pages_global(next_cg, current->mm);
And this doesn't seem to be "global reclaim" at all?
Because it essentially equals:
next_cg = sgx_reclaim_pages_global(NULL, current->mm);
which always reclaims from the ROOT.
So each call to sgx_reclaim_direct() will always reclaim from the ROOT --
any other LRUs in the hierarchy are unlikely to get any chance to be
reclaimed.
> +
> + if (next_cg != misc_cg_root())
> + put_misc_cg(next_cg);
> +
> + sgx_put_cg(sgx_cg);
> }
>
> static int ksgxd(void *p)
> {
> + struct misc_cg *next_cg = NULL;
> +
> set_freezable();
>
> /*
> @@ -437,11 +489,15 @@ static int ksgxd(void *p)
> kthread_should_stop() ||
> sgx_should_reclaim_global(SGX_NR_HIGH_PAGES));
>
> - if (sgx_should_reclaim_global(SGX_NR_HIGH_PAGES))
> + while (!kthread_should_stop() && sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) {
> /* Indirect reclaim, no mm to charge, so NULL: */
> - sgx_reclaim_pages_global(NULL);
> + next_cg = sgx_reclaim_pages_global(next_cg, NULL);
> + cond_resched();
Should the 'put_misc_cg(next_cg)' be done within the while() loop rather
than below?
> + }
>
> - cond_resched();
> + if (next_cg != misc_cg_root())
> + put_misc_cg(next_cg);
> + next_cg = NULL;
Again, this doesn't seem to be "global reclaim" here, since you always
restart from the ROOT once the target pages have been reclaimed.
AFAICT it's completely different from the existing global reclaim.
> }
>
> return 0;
> @@ -583,6 +639,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
> */
> struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
> {
> + struct misc_cg *next_cg = NULL;
> struct sgx_cgroup *sgx_cg;
> struct sgx_epc_page *page;
> int ret;
> @@ -616,10 +673,19 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
> break;
> }
>
> - sgx_reclaim_pages_global(current->mm);
> + /*
> + * At this point, the usage within this cgroup is under its
> + * limit but there is no physical page left for allocation.
> + * Perform a global reclaim to get some pages released from any
> + * cgroup with reclaimable pages.
> + */
> + next_cg = sgx_reclaim_pages_global(next_cg, current->mm);
> cond_resched();
> }
Ditto IIUC.