[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bf21f955c1b56ef836ad03bc42d522b6d020edbf.camel@intel.com>
Date: Wed, 21 Feb 2024 10:48:58 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "Mehta, Sohil" <sohil.mehta@...el.com>, "mingo@...hat.com"
<mingo@...hat.com>, "jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org"
<x86@...nel.org>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "cgroups@...r.kernel.org"
<cgroups@...r.kernel.org>, "hpa@...or.com" <hpa@...or.com>,
"tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, "mkoutny@...e.com"
<mkoutny@...e.com>, "haitao.huang@...ux.intel.com"
<haitao.huang@...ux.intel.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"tj@...nel.org" <tj@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for
cgroup
On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
> StartHi Kai
> On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@...el.com> wrote:
> [...]
> >
> > So you introduced the work/workqueue here but there's no place which
> > actually
> > queues the work. IMHO you can either:
> >
> > 1) move relevant code change here; or
> > 2) focus on introducing core functions to reclaim certain pages from a
> > given EPC
> > cgroup w/o workqueue and introduce the work/workqueue in later patch.
> >
> > Makes sense?
> >
>
> Starting in v7, I was trying to split the big patch, #10 in v6 as you and
> others suggested. My thought process was to put infrastructure needed for
> per-cgroup reclaim in the front, then turn on per-cgroup reclaim in [v9
> 13/15] in the end.
That's reasonable for sure.
>
> Before that, all reclaimables are tracked in the global LRU so really
> there is no "reclaim certain pages from a given EPC cgroup w/o workqueue"
> or reclaim through workqueue before that point, as suggested in #2. This
> patch puts down the implementation for both flows but neither used yet, as
> stated in the commit message.
I know it's not used yet. The point is how to split patches to make them more
self-contain and easy to review.
For #2, sorry for not being explicit -- I meant it seems it's more reasonable to
split in this way:
Patch 1)
a). change to sgx_reclaim_pages();
b). introduce sgx_epc_cgroup_reclaim_pages();
c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better name),
which just takes an EPC cgroup as input w/o involving any work/workqueue.
These functions are all related to how to implement reclaiming pages from a
given EPC cgroup, and they are logically related in terms of implementation thus
it's easier to be reviewed together.
Then you just need to justify the design/implementation in changelog/comments.
Patch 2)
- Introduce work/workqueue, and implement the logic to queue the work.
Now we all know there's a function to reclaim pages for a given EPC cgroup, then
we can focus on when that is called, either directly or indirectly.
>
> #1 would force me go back and merge the patches again.
I don't think so. I am not asking to put all things together, but only asking
to split in better way (that I see).
You mentioned some function is "Scheduled by sgx_epc_cgroup_try_charge() to
reclaim pages", but I am not seeing any code doing that in this patch. This
needs fixing, either by moving relevant code here, or removing these not-done-
yet comments.
For instance (I am just giving an example), if after review we found the
queue_work() shouldn't be done in try_charge(), you will need to go back to this
patch and remove these comments.
That's not the best way. Each patch needs to be self-contained.
>
> Sorry I feel kind of lost on this whole thing by now. It seems so random
> to me. Is there hard rules on this?
See above. I am only offering my opinion on how to split patches in better way.
>
> I was hoping these statements would help reviewers on the flow of the
> patches.
>
> At the end of [v9 04/15]:
>
> For now, the EPC cgroup simply blocks additional EPC allocation in
> sgx_alloc_epc_page() when the limit is reached. Reclaimable pages are
> still tracked in the global active list, only reclaimed by the global
> reclaimer when the total free page count is lower than a threshold.
>
> Later patches will reorganize the tracking and reclamation code in the
> global reclaimer and implement per-cgroup tracking and reclaiming.
>
> At the end of [v9 06/15]:
>
> Next patches will first get the cgroup reclamation flow ready while
> keeping pages tracked in the global LRU and reclaimed by ksgxd before we
> make the switch in the end for sgx_lru_list() to return per-cgroup
> LRU.
>
> At the end of [v9 08/15]:
>
> Both synchronous and asynchronous flows invoke the same top level reclaim
> function, and will be triggered later by sgx_epc_cgroup_try_charge()
> when usage of the cgroup is at or near its limit.
>
> At the end of [v9 10/15]:
> Note at this point, all reclaimable EPC pages are still tracked in the
> global LRU and per-cgroup LRUs are empty. So no per-cgroup reclamation
> is activated yet.
They are useful in the changelog in each patch I suppose, but to me we are
discussing different things.
I found one pain in the review is I have to jump back and forth many times among
multiple patches to see whether one patch is reasonable. That's why I am asking
whether there's better way to split patches so that each patch can be self-
contained logically in someway and easier to review.
Powered by blists - more mailing lists