linux-kernel - Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup try

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <op.2jrquskiwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Mon, 26 Feb 2024 15:48:18 -0600
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "Huang, Kai" <kai.huang@...el.com>, "tj@...nel.org" <tj@...nel.org>,
 "jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
 "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
 "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
 <hpa@...or.com>, "mingo@...hat.com" <mingo@...hat.com>,
 "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>, "mkoutny@...e.com"
 <mkoutny@...e.com>, "Mehta, Sohil" <sohil.mehta@...el.com>,
 "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "tglx@...utronix.de" <tglx@...utronix.de>, "bp@...en8.de" <bp@...en8.de>,
 "Dave Hansen" <dave.hansen@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
 "seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
 <anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
 "kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
 <yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
 "chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup
 try_charge()

Hi Dave,

On Mon, 26 Feb 2024 08:04:54 -0600, Dave Hansen <dave.hansen@...el.com>  
wrote:

> On 2/26/24 03:36, Huang, Kai wrote:
>>> In case of overcomitting, even if we always reclaim from the same  
>>> cgroup
>>> for each fault, one group may still interfere the other: e.g.,  
>>> consider an
>>> extreme case in that group A used up almost all EPC at the time group B
>>> has a fault, B has to fail allocation and kill enclaves.
>> If the admin allows group A to use almost all EPC, to me it's fair to  
>> say he/she
>> doesn't want to run anything inside B at all and it is acceptable  
>> enclaves in B
>> to be killed.
>
> Folks, I'm having a really hard time following this thread.  It sounds
> like there's disagreement about when to do system-wide reclaim.  Could
> someone remind me of the choices that we have?  (A proposed patch would
> go a _long_ way to helping me understand)
>

In case of overcomitting, i.e., sum of limits greater than the EPC  
capacity, if one group has a fault, and its usage is not above its own  
limit (try_charge() passes), yet total usage of the system has exceeded  
the capacity, whether we do global reclaim or just reclaim pages in the  
current faulting group.

> Also, what does the core mm memcg code do?
>
I'm not sure. I'll try to find out but it'd be appreciated if someone more  
knowledgeable can comment on this. memcg also has the protection mechanism  
(i.e., min, low settings) to guarantee some allocation per group so its  
approach might not be applicable to misc controller here.

> Last, what is the simplest (least amount of code) thing that the SGX
> cgroup controller could implement here?
>
>

I still think the current approach of doing global reclaim is reasonable  
and simple: try_charge() checks cgroup limit and reclaim within the group  
if needed, then do EPC page allocation, reclaim globally if allocation  
fails due to global usage reaches the capacity.

I'm not sure how not doing global reclaiming in this case would bring any  
benefit. Please see my response to Kai's example cases.

Thanks
Haitao