Message-ID: <ZZgVi7rHAAzyzbz5@himmelriiki>
Date: Fri, 5 Jan 2024 16:43:23 +0200
From: Mikko Ylinen <mikko.ylinen@...ux.intel.com>
To: Haitao Huang <haitao.huang@...ux.intel.com>
Cc: "Mehta, Sohil" <sohil.mehta@...el.com>,
"jarkko@...nel.org" <jarkko@...nel.org>,
"x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
"hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"tj@...nel.org" <tj@...nel.org>,
"mkoutny@...e.com" <mkoutny@...e.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bp@...en8.de" <bp@...en8.de>, "Huang, Kai" <kai.huang@...el.com>,
Dave Hansen <dave.hansen@...el.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>,
"anakrish@...rosoft.com" <anakrish@...rosoft.com>,
"sean.j.christopherson@...el.com" <sean.j.christopherson@...el.com>,
"Li, Zhiquan1" <zhiquan1.li@...el.com>,
"yangjie@...rosoft.com" <yangjie@...rosoft.com>
Subject: Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim function
On Thu, Jan 04, 2024 at 01:11:15PM -0600, Haitao Huang wrote:
> Hi Dave,
>
> On Wed, 03 Jan 2024 10:37:35 -0600, Dave Hansen <dave.hansen@...el.com>
> wrote:
>
> > On 12/18/23 13:24, Haitao Huang wrote:
> > > @Dave and @Michal, Your thoughts? Or could you confirm we should not
> > > do reclaim per cgroup at all?
> > What's the benefit of doing reclaim per cgroup? Is that worth the extra
> > complexity?
> >
>
> Without reclaiming per cgroup, then we have to always set the limit to
> enclave's peak usage. This may not be efficient utilization as in many cases
> each enclave can perform fine with EPC limit set less than peak. Basically
> each group can not give up some pages for greater good without dying :-)
+1. This is exactly my thinking too. Per-cgroup reclaim is important for
the container use case we are working on. I also think it makes the limit
more meaningful: it becomes the per-container pool of EPC pages available
for use (which is independent of the enclave size).
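
To make the difference concrete, here is a toy model in plain Python. It is
not kernel code and does not use any name from the series: EpcCgroup, touch()
and run() are made up for illustration, and eviction order is assumed to be
simple LRU. It only shows what happens when a cgroup's limit is set below the
enclave's peak usage, with and without per-cgroup reclaim.

```python
from collections import OrderedDict


class EpcCgroup:
    """Toy model of one cgroup's EPC accounting (hypothetical, not kernel code)."""

    def __init__(self, limit, reclaim):
        if limit < 1:
            raise ValueError("limit must be at least 1 page")
        self.limit = limit              # max resident EPC pages for this cgroup
        self.reclaim = reclaim          # True: swap out own pages at the limit
        self.resident = OrderedDict()   # page id -> None, oldest first (assumed LRU)
        self.swapped = set()            # pages currently swapped out

    def touch(self, page):
        """Access a page; return False if the allocation cannot be satisfied."""
        if page in self.resident:
            self.resident.move_to_end(page)
            return True
        if len(self.resident) >= self.limit:
            if not self.reclaim:
                return False            # no per-cgroup reclaim: allocation fails
            victim, _ = self.resident.popitem(last=False)
            self.swapped.add(victim)    # oldest page of this cgroup is swapped out
        self.swapped.discard(page)
        self.resident[page] = None
        return True


def run(limit, reclaim, peak):
    """Touch `peak` distinct pages; return (success, resident, swapped) counts."""
    cg = EpcCgroup(limit, reclaim)
    ok = all(cg.touch(p) for p in range(peak))
    return ok, len(cg.resident), len(cg.swapped)


print(run(limit=4, reclaim=True, peak=6))   # limit below peak, reclaim enabled
print(run(limit=4, reclaim=False, peak=6))  # limit below peak, no reclaim
print(run(limit=6, reclaim=False, peak=6))  # limit must equal peak to succeed
```

With reclaim, the enclave keeps running inside a 4-page pool and two pages get
swapped out. Without it, the only way to make the same workload succeed is to
raise the limit to the peak.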
>
> Also with enclaves enabled with EDMM, the peak usage is not static so hard
> to determine upfront. Hence it might be an operation/deployment
> inconvenience.
>
> In case of over-committing (sum of limits > total capacity), one cgroup at
> peak usage may require swapping pages out in a different cgroup if system is
> overloaded at that time.
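
The over-commit case can be sketched the same way. Again a toy model, not
kernel code: simulate() is a hypothetical helper, every allocation is assumed
to be a new page, limits are assumed to be at least 1, and global reclaim is
assumed to evict the oldest resident page regardless of its owner.

```python
from collections import deque


def simulate(capacity, limits, accesses):
    """Return {cgroup: pages swapped out} for a sequence of new-page allocations.

    capacity -- total EPC pages in the system
    limits   -- {cgroup: per-cgroup limit}, each assumed >= 1
    accesses -- iterable of (cgroup, page) allocations, every page is new
    """
    lru = deque()                         # (cgroup, page), oldest first, global
    resident = {cg: 0 for cg in limits}
    swapped = {cg: 0 for cg in limits}
    for cg, page in accesses:
        if resident[cg] >= limits[cg]:
            # Per-cgroup limit hit: evict this cgroup's own oldest page.
            victim = next(e for e in lru if e[0] == cg)
        elif len(lru) >= capacity:
            # Global pressure: evict the oldest page of any cgroup.
            victim = lru[0]
        else:
            victim = None
        if victim is not None:
            lru.remove(victim)
            resident[victim[0]] -= 1
            swapped[victim[0]] += 1
        lru.append((cg, page))
        resident[cg] += 1
    return swapped


# "a" allocates 4 pages, then "b" ramps up to its peak of 6 pages.
workload = [("a", i) for i in range(4)] + [("b", i) for i in range(6)]

# Over-committed: 6 + 6 > 8, so "b" reaching its peak pushes out pages of "a".
print(simulate(8, {"a": 6, "b": 6}, workload))

# Not over-committed: 4 + 4 == 8, so "b" only ever reclaims from itself.
print(simulate(8, {"a": 4, "b": 4}, workload))
```

In the first run "b" never reaches its own limit, yet two pages of "a" are
swapped out. In the second run the pressure stays inside "b".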
>
> > The key question here is whether we want the SGX VM to be complex and
> > more like the real VM or simple when a cgroup hits its limit. Right?
> >
>
> Although it's fair to say the majority of complexity of this series is in
> support for reclaiming per cgroup, I think it's manageable and much less
> than real VM after we removed the enclave killing parts: the only extra
> effort is to track pages in separate lists and reclaim them separately, as
> opposed to tracking them in one global list and reclaiming them together.
> The main reclaiming loop code is still pretty much the same as before.
>
>
> > If stopping at patch 5 and having less code is even remotely an option,
> > why not do _that_?
> >
> I hope I described limitations clear enough above.
> If those are OK with users and also make it acceptable for merge quickly,
You explained the gaps very well already. I don't think the simple
version without per-cgroup reclaiming is enough for the container case.
Mikko