linux-kernel - Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC


Message-ID: <op.2cztslnpwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date:   Tue, 17 Oct 2023 23:37:23 -0500
From:   "Haitao Huang" <haitao.huang@...ux.intel.com>
To:     Michal Koutný <mkoutny@...e.com>
Cc:     "Christopherson, Sean" <seanjc@...gle.com>,
        "Huang, Kai" <kai.huang@...el.com>,
        "Zhang, Bo" <zhanb@...rosoft.com>,
        "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "yangjie@...rosoft.com" <yangjie@...rosoft.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "Li, Zhiquan1" <zhiquan1.li@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "tj@...nel.org" <tj@...nel.org>,
        "anakrish@...rosoft.com" <anakrish@...rosoft.com>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "hpa@...or.com" <hpa@...or.com>,
        "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
        "Mehta, Sohil" <sohil.mehta@...el.com>,
        "bp@...en8.de" <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>,
        "kristen@...ux.intel.com" <kristen@...ux.intel.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim
 EPC

Hi Michal,

On Tue, 17 Oct 2023 13:54:46 -0500, Michal Koutný <mkoutny@...e.com> wrote:

> Hello Haitao.
>
> On Tue, Oct 17, 2023 at 07:58:02AM -0500, Haitao Huang  
> <haitao.huang@...ux.intel.com> wrote:
>> AFAIK, before we introduced the max_write() callback in this series, no misc
>> controller could enforce the limit when misc.max is reduced. E.g., I
>> don't think CVMs would be killed when the ASID limit is reduced and the
>> cgroup was full before the limit was reduced.
>
> Yes, the misc controller was meant to be simple; current >= max serves to
> prevent new allocations.
>
Thanks for confirming. Maybe another alternative is to just keep max_write
non-preemptive, so there is no need to add the max_write() callback.

The EPC controller only triggers reclaiming on new allocations, or returns
NOMEM if there is nothing more to reclaim. Reclaiming here includes both
normal EPC page reclaiming and killing enclaves in out-of-EPC cases. vEPC
pages assigned to guests are basically carved out and never reclaimable by
the host.

As we no longer enforce limits when a lower value is written to misc.max,
users should not expect the cgroup to forcibly reclaim pages from enclaves
or kill VMs/enclaves as a result of reducing limits 'in-place'. Users should
always create cgroups, set limits, and then launch enclaves/VMs into the
groups created.
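
To make the intended semantics concrete, here is a minimal toy model in C. It
is only a sketch and not the kernel's misc or SGX EPC cgroup code: the struct,
its fields, and the toy_* function names are all hypothetical. It assumes that
writing a lower limit only records the new value, and that reclaim happens only
when a new allocation would exceed the limit.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_epc_cg {
	size_t max;         /* limit, as written to misc.max */
	size_t usage;       /* pages currently charged */
	size_t reclaimable; /* charged pages the host could reclaim (never vEPC) */
};

/* Non-preemptive write: record the new limit and leave usage untouched. */
void toy_set_max(struct toy_epc_cg *cg, size_t new_max)
{
	cg->max = new_max;
}

/*
 * Charge nr new pages. Reclaim is attempted only here, and only for the
 * amount this allocation needs. Newly charged pages are treated as
 * non-reclaimable to keep the model small.
 */
int toy_try_charge(struct toy_epc_cg *cg, size_t nr)
{
	size_t over;

	if (cg->usage + nr <= cg->max) {
		cg->usage += nr;
		return 0;
	}

	over = cg->usage + nr - cg->max;
	if (over > cg->reclaimable)
		return -ENOMEM; /* nothing more to reclaim */

	cg->reclaimable -= over;
	cg->usage -= over;
	cg->usage += nr;
	return 0;
}
```

In this model, lowering max below the current usage changes nothing until the
next toy_try_charge() call, which either reclaims enough to fit or fails.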

> FTR, at some point in time memory.max was considered for reclaim control
> of regular pages but it turned out to be too coarse (and OOM killing
> processes if amount was not sensed correctly) and this eventually
> evolved into specific mechanism of memory.reclaim.
> So I'm mentioning this should that be an interface with better semantic
> for your use case (and misc.max writes can remain non-preemptive).
>

Yes, we can introduce misc.reclaim to give users a knob to forcefully reduce
usage if that is really needed in real usage. The semantics would make
force-killing VMs explicit to the user.
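
As a rough illustration of what an explicit knob could look like, here is a toy
model in C. misc.reclaim does not exist today; the struct and function names
are hypothetical. Returning -EAGAIN when fewer pages than requested are
reclaimed is borrowed from the documented behavior of memory.reclaim and is
only an assumption for misc. The model does not cover killing enclaves or VMs.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not an existing kernel interface. */
struct toy_cg {
	size_t usage;       /* pages currently charged */
	size_t reclaimable; /* charged pages that can be reclaimed (<= usage) */
};

/*
 * Explicit, user-requested reclaim of nr pages.
 * Returns 0 if nr pages were reclaimed, -EAGAIN if fewer were available.
 */
int toy_misc_reclaim(struct toy_cg *cg, size_t nr)
{
	size_t got = nr < cg->reclaimable ? nr : cg->reclaimable;

	cg->reclaimable -= got;
	cg->usage -= got;
	return got == nr ? 0 : -EAGAIN;
}
```

The point of the separate entry point is that usage only shrinks when the user
asks for it, while writes to the limit stay non-preemptive.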

> One more note -- I was quite confused when I read in the rest of the
> series about OOM and _kill_ing but then I found no such measure in the
> code implementation. So I would suggest two terminological changes:
>
> - the basic premise of the series (00/18) is that EPC pages are a
>   different resource than memory, hence choose a better suiting name
>   than OOM (out of memory) condition,

I couldn't come up with a good name. Out of EPC (OOEPC) maybe? I feel OOEPC
would be hard to read in code, though. OOM was relatable, as the condition is
similar to normal OOM but for a special kind of memory :-) I'm open to any
better suggestions.

> - killing -- (unless you have an intention to implement process
>   termination later) My current interpretation is that it is rather some
>   aggressive unmapping within the address space, so a less confusing name
>   for that would be "reclaim".
>

Yes. Killing here refers to killing an enclave, analogous to killing a
process, not just 'reclaim', though. I can change the text to always say
'killing enclave' explicitly.

>
>> I think EPC pages to VMs could have the same behavior, once they are given
>> to a guest, never taken back by the host. For enclaves on host side, pages
>> are reclaimable, that allows us to enforce in a similar way to memcg.
>
> Is this distinction between preemptability of EPC pages mandated by the
> HW implementation? (host/"process" enclaves vs VM enclaves) Or do users
> have an option to lock certain pages in memory, which yields this
> difference?
>

The difference is really a result of the current vEPC implementation. Because
enclave pages, once in use, contain confidential content, they need a special
process to reclaim. So it's complex to implement graceful host reclaiming of
guest EPC pages.

Thanks
Haitao