linux-kernel - Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim function

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <op.2g1eoojxwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Thu, 04 Jan 2024 13:20:38 -0600
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: Michal Koutný <mkoutny@...e.com>
Cc: "Mehta, Sohil" <sohil.mehta@...el.com>, "jarkko@...nel.org"
 <jarkko@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
 "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
 "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
 <hpa@...or.com>, "mingo@...hat.com" <mingo@...hat.com>, "tj@...nel.org"
 <tj@...nel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
 "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "bp@...en8.de"
 <bp@...en8.de>, "Huang, Kai" <kai.huang@...el.com>,
 "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
 "seanjc@...gle.com" <seanjc@...gle.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
 "kristen@...ux.intel.com" <kristen@...ux.intel.com>, "anakrish@...rosoft.com"
 <anakrish@...rosoft.com>, "sean.j.christopherson@...el.com"
 <sean.j.christopherson@...el.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
 "yangjie@...rosoft.com" <yangjie@...rosoft.com>
Subject: Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim
 function

Hi Michal,

On Thu, 04 Jan 2024 06:38:41 -0600, Michal Koutný <mkoutny@...e.com> wrote:

> Hello.
>
> On Mon, Dec 18, 2023 at 03:24:40PM -0600, Haitao Huang  
> <haitao.huang@...ux.intel.com> wrote:
>> Thanks for raising this. Actually my understanding the above discussion  
>> was
>> mainly about not doing reclaiming by killing enclaves, i.e., I assumed
>> "reclaiming" within that context only meant for that particular kind.
>>
>> As Mikko pointed out, without reclaiming per-cgroup, the max limit of  
>> each
>> cgroup needs to accommodate the peak usage of enclaves within that  
>> cgroup.
>> That may be inconvenient for practical usage and limits could be forced  
>> to
>> be set larger than necessary to run enclaves performantly. For example,  
>> we
>> can observe following undesired consequences comparing a system with  
>> current
>> kernel loaded with enclaves whose total peak usage is greater than the  
>> EPC
>> capacity.
>>
>> 1) If a user wants to load the same exact enclaves but in separate  
>> cgroups,
>> then the sum of cgroup limits must be higher than the capacity and the
>> system will end up doing the same old global reclaiming as it is  
>> currently
>> doing. Cgroup is not useful at all for isolating EPC consumptions.
>
> That is the use of limits to prevent a runaway cgroup smothering the
> system. Overcommited values in such a config are fine because the more
> simultaneous runaways, the less likely.
> The peak consumption is on the fair expense of others (some efficiency)
> and the limit contains the runaway (hence the isolation).
>

This makes sense to me in theory. Mikko, Chris Y/Bo Z, your thoughts on  
whether this is good enough for your intended usages?

>> 2) To isolate impact of usage of each cgroup on other cgroups and yet  
>> still
>> being able to load each enclave, the user basically has to carefully  
>> plan to
>> ensure the sum of cgroup max limits, i.e., the sum of peak usage of
>> enclaves, is not reaching over the capacity. That means no  
>> over-commiting
>> allowed and the same system may not be able to load as many enclaves as  
>> with
>> current kernel.
>
> Another "config layout" of limits is to achieve partitioning (sum ==
> capacity). That is perfect isolation but it naturally goes against
> efficient utilization. The way other controllers approach this trade-off
> is with weights (cpu, io) or protections (memory). I'm afraid misc
> controller is not ready for this.
>
> My opinion is to start with the simple limits (first patches) and think
> of prioritization/guarantee mechanism based on real cases.
>

We moved away from using mem like custom controller with (low, high, max)  
to misc controller. But if we need add those down the road, then the  
interface needs be changed. So my concern on this route would be whether  
misc would allow any of those extensions. On the other hand, it might turn  
out less complex just doing the reclamation per cgroup.

Thanks a lot for your comments and they are really helpful!

Haitao