lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <34a337b96a5a917612c4ec4eff2b5a378c21879b.camel@intel.com>
Date:   Mon, 6 Nov 2023 22:18:30 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "mingo@...hat.com" <mingo@...hat.com>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "hpa@...or.com" <hpa@...or.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
        "mkoutny@...e.com" <mkoutny@...e.com>,
        "haitao.huang@...ux.intel.com" <haitao.huang@...ux.intel.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "tj@...nel.org" <tj@...nel.org>,
        "Mehta, Sohil" <sohil.mehta@...el.com>,
        "bp@...en8.de" <bp@...en8.de>
CC:     "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "Zhang, Bo" <zhanb@...rosoft.com>,
        "kristen@...ux.intel.com" <kristen@...ux.intel.com>,
        "anakrish@...rosoft.com" <anakrish@...rosoft.com>,
        "sean.j.christopherson@...el.com" <sean.j.christopherson@...el.com>,
        "Li, Zhiquan1" <zhiquan1.li@...el.com>,
        "yangjie@...rosoft.com" <yangjie@...rosoft.com>
Subject: Re: [PATCH v6 04/12] x86/sgx: Implement basic EPC misc cgroup
 functionality

> 
> > > +/**
> > > + * sgx_epc_cgroup_try_charge() - hierarchically try to charge a single  
> > > EPC page
> > > + *
> > > + * Returns EPC cgroup or NULL on success, -errno on failure.
> > > + */
> > > +struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(void)
> > > +{
> > > +	struct sgx_epc_cgroup *epc_cg;
> > > +	int ret;
> > > +
> > > +	if (sgx_epc_cgroup_disabled())
> > > +		return NULL;
> > > +
> > > +	epc_cg = sgx_epc_cgroup_from_misc_cg(get_current_misc_cg());
> > > +	ret = misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE);
> > > +
> > > +	if (!ret) {
> > > +		/* No epc_cg returned, release ref from get_current_misc_cg() */
> > > +		put_misc_cg(epc_cg->cg);
> > > +		return ERR_PTR(-ENOMEM);
> > 
> > misc_cg_try_charge() returns 0 when successfully charged, no?
> 
> Right. I really made some mess in rebasing :-(
> 
> > 
> > > +	}
> > > +
> > > +	/* Ref released in sgx_epc_cgroup_uncharge() */
> > > +	return epc_cg;
> > > +}
> > 
> > IMHO the above _try_charge() returning a pointer of EPC cgroup is a  
> > little bit
> > odd, because it doesn't match the existing misc_cg_try_charge() which  
> > returns
> > whether the charge is successful or not.  sev_misc_cg_try_charge()  
> > matches
> > misc_cg_try_charge() too.
> > 
> > I think it's better to split "getting EPC cgroup" part out as a separate  
> > helper,
> > and make this _try_charge() match existing pattern:
> > 
> > 	struct sgx_epc_cgroup *sgx_get_current_epc_cg(void)
> > 	{
> > 		if (sgx_epc_cgroup_disabled())
> > 			return NULL;
> > 	
> > 		return sgx_epc_cgroup_from_misc_cg(get_current_misc_cg());
> > 	}
> > 
> > 	int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
> > 	{
> > 		if (!epc_cg)
> > 			return -EINVAL;
> > 	
> > 		return misc_cg_try_charge(epc_cg->cg);
> > 	}
> > 
> > Having sgx_get_current_epc_cg() also makes the caller easier to read,  
> > because we
> > can immediately know we are going to charge the *current* EPC cgroup,  
> > but not
> > some cgroup hidden within sgx_epc_cgroup_try_charge().
> > 
> 
> Actually, unlike other misc controllers, we need charge and get the epc_cg  
> reference at the same time. 
> 

Can you elaborate?

And in practice you always call sgx_epc_cgroup_try_charge() right after
sgx_get_current_epc_cg() anyway.  The only difference is the whole thing is done
in one function or in separate functions.

[...]


> > >  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
> > >  {
> > >  	struct sgx_epc_page *page;
> > > +	struct sgx_epc_cgroup *epc_cg;
> > > +
> > > +	epc_cg = sgx_epc_cgroup_try_charge();
> > > +	if (IS_ERR(epc_cg))
> > > +		return ERR_CAST(epc_cg);
> > > 
> > >  	for ( ; ; ) {
> > >  		page = __sgx_alloc_epc_page();
> > > @@ -580,10 +587,21 @@ struct sgx_epc_page *sgx_alloc_epc_page(void  
> > > *owner, bool reclaim)
> > >  			break;
> > >  		}
> > > 
> > > +		/*
> > > +		 * Need to do a global reclamation if cgroup was not full but free
> > > +		 * physical pages run out, causing __sgx_alloc_epc_page() to fail.
> > > +		 */
> > >  		sgx_reclaim_pages();
> > 
> > What's the final behaviour?  IIUC it should be reclaiming from the  
> > *current* EPC
> > cgroup?  If so shouldn't we just pass the @epc_cg to it here?
> > 
> > I think we can make this patch as "structure" patch w/o actually having  
> > EPC
> > cgroup enabled, i.e., sgx_get_current_epc_cg() always return NULL.
> > 
> > And we can have one patch to change sgx_reclaim_pages() to take the  
> > 'struct
> > sgx_epc_lru_list *' as argument:
> > 
> > 	void sgx_reclaim_pages_lru(struct sgx_epc_lru_list * lru)
> > 	{
> > 		...
> > 	}
> > 
> > Then here we can have something like:
> > 
> > 	void sgx_reclaim_pages(struct sgx_epc_cg *epc_cg)
> > 	{
> > 		struct sgx_epc_lru_list *lru =			epc_cg ? &epc_cg->lru :  
> > &sgx_global_lru;
> > 
> > 		sgx_reclaim_pages_lru(lru);
> > 	}
> > 
> > Makes sense?
> > 
> 
> This is purely global reclamation. No cgroup involved. 
> 

Again why?  Here you are allocating one EPC page for enclave in a particular EPC
cgroup.  When that fails, shouldn't you try only to reclaim from the *current*
EPC cgroup?  Or at least you should try to reclaim from the *current* EPC cgroup
first?

> You can see it  
> later in changes in patch 10/12. For now I just make a comment there but  
> no real changes. Cgroup reclamation will be done as part of _try_charge  
> call.
> 
> > >  		cond_resched();
> > >  	}
> > > 
> > > +	if (!IS_ERR(page)) {
> > > +		WARN_ON_ONCE(page->epc_cg);
> > > +		page->epc_cg = epc_cg;
> > > +	} else {
> > > +		sgx_epc_cgroup_uncharge(epc_cg);
> > > +	}
> > > +
> > >  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> > >  		wake_up(&ksgxd_waitq);
> > > 
> > > @@ -604,6 +622,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
> > >  	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
> > >  	struct sgx_numa_node *node = section->node;
> > > 
> > > +	if (page->epc_cg) {
> > > +		sgx_epc_cgroup_uncharge(page->epc_cg);
> > > +		page->epc_cg = NULL;
> > > +	}
> > > +
> > >  	spin_lock(&node->lock);
> > > 
> > >  	page->owner = NULL;
> > > @@ -643,6 +666,7 @@ static bool __init sgx_setup_epc_section(u64  
> > > phys_addr, u64 size,
> > >  		section->pages[i].flags = 0;
> > >  		section->pages[i].owner = NULL;
> > >  		section->pages[i].poison = 0;
> > > +		section->pages[i].epc_cg = NULL;
> > >  		list_add_tail(&section->pages[i].list, &sgx_dirty_page_list);
> > >  	}
> > > 
> > > @@ -787,6 +811,7 @@ static void __init arch_update_sysfs_visibility(int  
> > > nid) {}
> > >  static bool __init sgx_page_cache_init(void)
> > >  {
> > >  	u32 eax, ebx, ecx, edx, type;
> > > +	u64 capacity = 0;
> > >  	u64 pa, size;
> > >  	int nid;
> > >  	int i;
> > > @@ -837,6 +862,7 @@ static bool __init sgx_page_cache_init(void)
> > > 
> > >  		sgx_epc_sections[i].node =  &sgx_numa_nodes[nid];
> > >  		sgx_numa_nodes[nid].size += size;
> > > +		capacity += size;
> > > 
> > >  		sgx_nr_epc_sections++;
> > >  	}
> > > @@ -846,6 +872,8 @@ static bool __init sgx_page_cache_init(void)
> > >  		return false;
> > >  	}
> > > 
> > > +	misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, capacity);

Hmm.. I think this is why MISC_CG_RES_SGX_EPC is needed when
!CONFIG_CGROUP_SGX_EPC.

> > > +
> > >  	return true;
> > >  }
> > 
> > I would separate setting up capacity as a separate patch.
> 
> I thought about that, but again it was only 3-4 lines all in this function  
> and it's also necessary part of basic setup for misc controller...

Fine.  Anyway it depends on what things you want to do on this patch. It's fine
to include the capacity if this patch is some "structure" patch that shows the
high level flow of how EPC cgroup works.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ