lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 22 Feb 2024 16:57:40 -0600
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "hpa@...or.com" <hpa@...or.com>, "tim.c.chen@...ux.intel.com"
 <tim.c.chen@...ux.intel.com>, "linux-sgx@...r.kernel.org"
 <linux-sgx@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
 "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
 "jarkko@...nel.org" <jarkko@...nel.org>, "cgroups@...r.kernel.org"
 <cgroups@...r.kernel.org>, "linux-kernel@...r.kernel.org"
 <linux-kernel@...r.kernel.org>, "mkoutny@...e.com" <mkoutny@...e.com>,
 "tglx@...utronix.de" <tglx@...utronix.de>, "Mehta, Sohil"
 <sohil.mehta@...el.com>, "tj@...nel.org" <tj@...nel.org>, "mingo@...hat.com"
 <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>, "Huang, Kai"
 <kai.huang@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
 "seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
 <anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
 "kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
 <yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
 "chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup
 try_charge()

On Thu, 22 Feb 2024 15:26:05 -0600, Huang, Kai <kai.huang@...el.com> wrote:

>
>
> On 23/02/2024 6:09 am, Haitao Huang wrote:
>> On Wed, 21 Feb 2024 05:06:02 -0600, Huang, Kai <kai.huang@...el.com>  
>> wrote:
>>
>>>
>>>> -int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
>>>> +int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, bool  
>>>> reclaim)
>>>>  {
>>>> -    return misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,  
>>>> PAGE_SIZE);
>>>> +    for (;;) {
>>>> +        if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
>>>> +                    PAGE_SIZE))
>>>> +            break;
>>>> +
>>>> +        if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
>>>> +            return -ENOMEM;
>>>> +
>>>> +        if (signal_pending(current))
>>>> +            return -ERESTARTSYS;
>>>> +
>>>> +        if (!reclaim) {
>>>> +            queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);
>>>> +            return -EBUSY;
>>>> +        }
>>>> +
>>>> +        if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, false))
>>>> +            /* All pages were too young to reclaim, try again a  
>>>> little later */
>>>> +            schedule();
>>>> +    }
>>>> +
>>>> +    return 0;
>>>>  }
>>>>
>>>
>>> Seems this code change is 90% similar to the existing code in the
>>> sgx_alloc_epc_page():
>>>
>>>     ...
>>>     for ( ; ; ) {
>>>                 page = __sgx_alloc_epc_page();
>>>                 if (!IS_ERR(page)) {
>>>                         page->owner = owner;
>>>                         break;
>>>                 }
>>>
>>>                 if (list_empty(&sgx_active_page_list))
>>>                         return ERR_PTR(-ENOMEM);
>>>
>>>                 if (!reclaim) {
>>>                         page = ERR_PTR(-EBUSY);
>>>                         break;
>>>                 }
>>>
>>>                 if (signal_pending(current)) {
>>>                         page = ERR_PTR(-ERESTARTSYS);
>>>                         break;
>>>                 }
>>>
>>>                 sgx_reclaim_pages();
>>>                 cond_resched();
>>>         }
>>>     ...
>>>
>>> Is it better to move the logic/code change in try_charge() out to
>>> sgx_alloc_epc_page() to unify them?
>>>
>>> IIUC, the logic is quite similar: When you either failed to allocate  
>>> one page,
>>> or failed to charge one page, you try to reclaim EPC page(s) from the  
>>> current
>>> EPC cgroup, either directly or indirectly.
>>>
>>> No?
>>  Only these lines are the same:
>>                  if (!reclaim) {
>>                          page = ERR_PTR(-EBUSY);
>>                          break;
>>                  }
>>                   if (signal_pending(current)) {
>>                          page = ERR_PTR(-ERESTARTSYS);
>>                          break;
>>                  }
>>  In sgx_alloc_epc_page() we do global reclamation but here we do  
>> per-cgroup reclamation.
>
> But why?  If we failed to allocate, shouldn't we try to reclaim from the  
> _current_ EPC cgroup instead of global?  E.g., I thought one enclave in  
> one EPC cgroup requesting insane amount of EPC shouldn't impact enclaves  
> inside other cgroups?
>
Right. When code reaches to here, we already passed reclaim per cgroup.  
The cgroup may not at or reach limit but system has run out of physical  
EPC.

Thanks
Haitao

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ