Date: Mon, 06 May 2024 22:21:39 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: jarkko@...nel.org, dave.hansen@...ux.intel.com, tj@...nel.org,
 mkoutny@...e.com, linux-kernel@...r.kernel.org, linux-sgx@...r.kernel.org,
 x86@...nel.org, cgroups@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
 bp@...en8.de, hpa@...or.com, sohil.mehta@...el.com,
 tim.c.chen@...ux.intel.com, "Huang, Kai" <kai.huang@...el.com>
Cc: zhiquan1.li@...el.com, kristen@...ux.intel.com, seanjc@...gle.com,
 zhanb@...rosoft.com, anakrish@...rosoft.com, mikko.ylinen@...ux.intel.com,
 yangjie@...rosoft.com, chrisyan@...rosoft.com
Subject: Re: [PATCH v13 12/14] x86/sgx: Turn on per-cgroup EPC reclamation

On Mon, 06 May 2024 19:10:42 -0500, Huang, Kai <kai.huang@...el.com> wrote:

>
>
> On 1/05/2024 7:51 am, Haitao Huang wrote:
>>     static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
>>   {
>> -	sgx_reclaim_pages(&sgx_global_lru, charge_mm);
>> +	if (IS_ENABLED(CONFIG_CGROUP_MISC))
>> +		sgx_cgroup_reclaim_pages(misc_cg_root(), charge_mm);
>> +	else
>> +		sgx_reclaim_pages(&sgx_global_lru, charge_mm);
>>   }
>>
>
> I think we have a problem here when we do global reclaim starting from  
> the ROOT cgroup:
>
> This function will mostly try to reclaim only from the ROOT cgroup,  
> and won't reclaim from the descendants.
>
> The reason is that sgx_cgroup_reclaim_pages() will simply return after  
> "scanning" SGX_NR_TO_SCAN (16) pages w/o going to the descendants, and  
> the "scanning" here simply means "removing the EPC page from the  
> cgroup's LRU list".
>
> So as long as the ROOT cgroup LRU contains more than SGX_NR_TO_SCAN (16)  
> pages, effectively sgx_cgroup_reclaim_pages() will just scan and return  
> w/o going into the descendants.  Having 16 EPC pages should be an "almost  
> always true" case I suppose.
>
> When the sgx_reclaim_pages_global() is called again, we will start from  
> the ROOT again.
>
> That means this doesn't truly reclaim "from global" at all.
>
> IMHO the behaviour of sgx_cgroup_reclaim_pages() is OK for per-cgroup  
> reclaim because I think in this case our intention is to try our best  
> to reclaim from the cgroup, i.e., whether we can reclaim from  
> descendants doesn't matter.
>
> But for global reclaim this doesn't work.
>
> Am I missing anything?
>
Good catch. This is indeed a problem if pages in a higher-level cgroup are  
always busy (being 'young'). The reclamation loop starting from this group  
may get stuck just shifting pages from the front to the tail of this group's  
LRU and never try to scan and reclaim pages in its descendants.
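
To make the failure mode concrete, here is a toy user-space model of it. This
is only a sketch and not the kernel code: the toy_* names, the single-child
hierarchy and the per-cgroup "all_young" flag are assumptions made up for
illustration. Only the scan budget of 16 mirrors SGX_NR_TO_SCAN as discussed
above.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TO_SCAN 16	/* mirrors SGX_NR_TO_SCAN (16) from the discussion */

/* Hypothetical stand-in for a cgroup that owns one EPC LRU list. */
struct toy_cgroup {
	int nr_pages;			/* pages currently on this LRU */
	int all_young;			/* nonzero: every page was recently accessed */
	struct toy_cgroup *child;	/* single descendant keeps the walk trivial */
};

/*
 * Scan up to @budget pages of one LRU. Young pages are only rotated back
 * to the tail, so they consume scan budget without being reclaimed.
 * Returns the number of pages scanned.
 */
static int toy_scan_lru(struct toy_cgroup *cg, int budget, int *reclaimed)
{
	int scanned = cg->nr_pages < budget ? cg->nr_pages : budget;

	if (!cg->all_young) {
		cg->nr_pages -= scanned;
		*reclaimed += scanned;
	}
	return scanned;
}

/*
 * Walk from @root towards its descendant and stop as soon as NR_TO_SCAN
 * pages have been scanned in total. Returns the number of pages reclaimed.
 */
static int toy_cgroup_reclaim(struct toy_cgroup *root)
{
	int scanned = 0, reclaimed = 0;
	struct toy_cgroup *cg;

	for (cg = root; cg && scanned < NR_TO_SCAN; cg = cg->child)
		scanned += toy_scan_lru(cg, NR_TO_SCAN - scanned, &reclaimed);

	return reclaimed;
}

/* Repeated "global" reclaim attempts; each one restarts at the root. */
static int toy_global_reclaim(struct toy_cgroup *root, int attempts)
{
	int total = 0;

	assert(root != NULL);
	while (attempts-- > 0)
		total += toy_cgroup_reclaim(root);

	return total;
}
```

In this model, a root holding 32 always-young pages above a child holding 64
reclaimable pages yields nothing no matter how often toy_global_reclaim() is
called, because the root alone exhausts the scan budget each time. Once the
root has fewer than 16 pages, the leftover budget reaches the child.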

Though this may not happen often, I think it does require a fix. Will do  
it in v14 :-)

Thanks
Haitao
