Message-ID: <op.2cztslnpwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Tue, 17 Oct 2023 23:37:23 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: Michal Koutný <mkoutny@...e.com>
Cc: "Christopherson,, Sean" <seanjc@...gle.com>,
"Huang, Kai" <kai.huang@...el.com>,
"Zhang, Bo" <zhanb@...rosoft.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
"yangjie@...rosoft.com" <yangjie@...rosoft.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"Li, Zhiquan1" <zhiquan1.li@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"tj@...nel.org" <tj@...nel.org>,
"anakrish@...rosoft.com" <anakrish@...rosoft.com>,
"jarkko@...nel.org" <jarkko@...nel.org>,
"hpa@...or.com" <hpa@...or.com>,
"mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"Mehta, Sohil" <sohil.mehta@...el.com>,
"bp@...en8.de" <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim
EPC
Hi Michal,
On Tue, 17 Oct 2023 13:54:46 -0500, Michal Koutný <mkoutny@...e.com> wrote:
> Hello Haitao.
>
> On Tue, Oct 17, 2023 at 07:58:02AM -0500, Haitao Huang
> <haitao.huang@...ux.intel.com> wrote:
>> AFAIK, before we introduce the max_write() callback in this series, no
>> misc controller would enforce the limit when misc.max is reduced. E.g.,
>> I don't think CVMs are killed when the ASID limit is reduced while the
>> cgroup was already full before the limit was reduced.
>
> Yes, the misc controller was meant to be simple; current >= max serves
> to prevent new allocations.
>
Thanks for confirming. Maybe another alternative is that we just keep
max_write non-preemptive, with no need to add a max_write() callback.
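For reference, my reading of the current behavior, written as a small
userspace model of misc_cg_try_charge() in kernel/cgroup/misc.c; this is
only a sketch of the semantics, not the kernel code:

#include <stdint.h>
#include <stdio.h>

/* Model of the misc controller charge check: a new allocation fails
 * once usage would exceed max, but lowering max never reclaims or
 * kills anything that is already charged.
 */
struct misc_res {
        uint64_t usage;
        uint64_t max;
};

static int misc_try_charge(struct misc_res *res, uint64_t amount)
{
        if (res->usage + amount > res->max)
                return -1;      /* the kernel returns -EBUSY here */
        res->usage += amount;
        return 0;
}

int main(void)
{
        struct misc_res epc = { .usage = 90, .max = 100 };

        printf("charge 20 -> %d\n", misc_try_charge(&epc, 20)); /* fails */
        epc.max = 50;   /* lowering max below current usage... */
        printf("usage still %llu\n", (unsigned long long)epc.usage);
        return 0;
}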
The EPC controller would only trigger reclaiming on new allocations, or
return -ENOMEM if there is nothing left to reclaim (see the
allocation-path sketch below). Reclaiming here includes both normal EPC
page reclaiming and killing enclaves in out-of-EPC cases. vEPC pages
assigned to guests are basically carved out and never reclaimable by the
host.
Since we would no longer enforce the limit when max_write lowers it,
users should not expect the cgroup to forcefully reclaim pages from
enclaves or kill VMs/enclaves as a result of reducing limits 'in-place'.
Users should always create the cgroups, set the limits, and only then
launch enclaves/VMs into the groups created.
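Roughly, the allocation path I have in mind is the pseudocode below. All
the epc_cg_*() names are made up for illustration; they are not the
actual functions in this series:

/* Enforcement happens only at allocation time, never when misc.max
 * is lowered on an already-populated cgroup.
 */
static struct sgx_epc_page *epc_cg_alloc(struct sgx_epc_cgroup *cg)
{
        for (;;) {
                if (epc_cg_try_charge(cg) == 0)
                        return sgx_alloc_epc_page();    /* under limit */

                if (epc_cg_reclaim_pages(cg))           /* normal reclaim */
                        continue;

                if (epc_cg_kill_enclaves(cg))           /* out-of-EPC case */
                        continue;

                /* vEPC pages are carved out for guests and not
                 * reclaimable by the host, so nothing more can be
                 * freed at this point.
                 */
                return ERR_PTR(-ENOMEM);
        }
}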
> FTR, at some point in time memory.max was considered for reclaim
> control of regular pages, but it turned out to be too coarse (OOM
> killing processes if the amount was not sensed correctly), and this
> eventually evolved into the specific mechanism of memory.reclaim.
> So I'm mentioning this in case that would be an interface with better
> semantics for your use case (and misc.max writes can remain
> non-preemptive).
>
Yes, we can introduce misc.reclaim to give users a knob to forcefully
reduce usage if that is really needed in practice. Those semantics would
make force-killing VMs explicit to the user (see the sketch below).
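Something along these lines, loosely modeled on the memory.reclaim
handler in mm/memcontrol.c; the misc.reclaim file and the
epc_cg_reclaim() hook are hypothetical:

/* Hypothetical misc.reclaim write handler: the user writes a page
 * count, so forced reclaim (and any enclave kills) becomes an
 * explicit request instead of a side effect of lowering misc.max.
 */
static ssize_t misc_reclaim_write(struct kernfs_open_file *of,
                                  char *buf, size_t nbytes, loff_t off)
{
        struct misc_cg *cg = css_misc(of_css(of));
        u64 nr_pages;
        int err;

        err = kstrtou64(strstrip(buf), 0, &nr_pages);
        if (err)
                return err;

        if (!epc_cg_reclaim(cg, nr_pages))      /* resource-specific hook */
                return -EAGAIN;

        return nbytes;
}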
> One more note -- I was quite confused when I read about OOM and
> _kill_ing in the rest of the series but then found no such measure in
> the code implementation. So I would suggest two terminological changes:
>
> - the basic premise of the series (00/18) is that EPC pages are a
>   different resource than memory, hence choose a better-suited name
>   than OOM (out of memory) condition,
I couldn't come up with a good name. Maybe out-of-EPC (OOEPC)? I feel
OOEPC would be hard to read in code, though. OOM was relatable as it is
similar to normal OOM, just for a special kind of memory :-) I'm open to
any better suggestions.
> - killing -- (unless you intend to implement process termination
>   later) my current interpretation is that it is rather some
>   aggressive unmapping within the address space, so a less confusing
>   name for that would be "reclaim".
>
Yes. 'Killing' here refers to killing an enclave, analogous to killing a
process, so it is not just 'reclaim'. I can change the wording to always
say 'killing enclaves' explicitly.
>
>> I think EPC pages given to VMs could have the same behavior: once
>> they are given to a guest, they are never taken back by the host. For
>> enclaves on the host side, pages are reclaimable, which allows us to
>> enforce limits in a way similar to memcg.
>
> Is this distinction between preemptability of EPC pages mandated by the
> HW implementation? (host/"process" enclaves vs VM enclaves) Or do users
> have an option to lock certain pages in memory that yields this
> difference?
>
The difference is really a result of the current vEPC implementation.
Because enclave pages, once in use, contain confidential content, they
require a special multi-step process to reclaim (outlined below), so it
is complex for the host to reclaim guest EPC pages gracefully.
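For context, host-side reclaim of a regular enclave page already
requires a multi-step flow; roughly, from sgx_reclaim_pages() in
arch/x86/kernel/cpu/sgx/main.c:

/*
 * 1. EBLOCK - block the page so no new TLB mappings can be created;
 * 2. ETRACK - track and then flush stale TLB entries on all CPUs
 *             that may have executed inside the enclave;
 * 3. EWB    - encrypt the page contents and write them back, with
 *             version metadata, to regular backing memory.
 *
 * There is no equivalent graceful path for vEPC today: once pages are
 * carved out for a guest, the host cannot EWB them without the
 * guest's cooperation, hence they are treated as unreclaimable.
 */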
Thanks
Haitao