lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 09 Oct 2023 20:38:12 -0500
From:   "Haitao Huang" <haitao.huang@...ux.intel.com>
To:     "mingo@...hat.com" <mingo@...hat.com>,
        "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "hpa@...or.com" <hpa@...or.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "bp@...en8.de" <bp@...en8.de>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "tj@...nel.org" <tj@...nel.org>,
        "Mehta, Sohil" <sohil.mehta@...el.com>,
        "Huang, Kai" <kai.huang@...el.com>
Cc:     "kristen@...ux.intel.com" <kristen@...ux.intel.com>,
        "anakrish@...rosoft.com" <anakrish@...rosoft.com>,
        "Li, Zhiquan1" <zhiquan1.li@...el.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
        "yangjie@...rosoft.com" <yangjie@...rosoft.com>,
        "Zhang, Bo" <zhanb@...rosoft.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim
 EPC

On Mon, 09 Oct 2023 20:18:00 -0500, Huang, Kai <kai.huang@...el.com> wrote:

> On Mon, 2023-10-09 at 20:04 -0500, Haitao Huang wrote:
>> On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai <kai.huang@...el.com>  
>> wrote:
>>
>> > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
>> > > From: Sean Christopherson <sean.j.christopherson@...el.com>
>> > >
>> > > Introduce the OOM path for killing an enclave with a reclaimer that  
>> is
>> > > no
>> > > longer able to reclaim enough EPC pages. Find a victim enclave,  
>> which
>> > > will be an enclave with only "unreclaimable" EPC pages left in the
>> > > cgroup LRU lists. Once a victim is identified, mark the enclave as  
>> OOM
>> > > and zap the enclave's entire page range, and drain all mm  
>> references in
>> > > encl->mm_list. Block allocating any EPC pages in #PF handler, or
>> > > reloading any pages in all paths, or creating any new mappings.
>> > >
>> > > The OOM killing path may race with the reclaimers: in some cases,  
>> the
>> > > victim enclave is in the process of reclaiming the last EPC pages  
>> when
>> > > OOM happens, that is, all pages other than SECS and VA pages are in
>> > > RECLAIMING_IN_PROGRESS state. The reclaiming process requires  
>> access to
>> > > the enclave backing, VA pages as well as SECS. So the OOM killer  
>> does
>> > > not directly release those enclave resources, instead, it lets all
>> > > reclaiming in progress to finish, and relies (as currently done) on
>> > > kref_put on encl->refcount to trigger sgx_encl_release() to do the
>> > > final cleanup.
>> > >
>> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
>> > > Co-developed-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> > > Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
>> > > Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> > > Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
>> > > Cc: Sean Christopherson <seanjc@...gle.com>
>> > > ---
>> > > V5:
>> > > - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY
>> > >
>> > > V4:
>> > > - Updates for patch reordering and typo fixes.
>> > >
>> > > V3:
>> > > - Rebased to use the new VMA_ITERATOR to zap VMAs.
>> > > - Fixed the racing cases by blocking new page allocation/mapping and
>> > > reloading when enclave is marked for OOM. And do not release any  
>> enclave
>> > > resources other than draining mm_list entries, and let pages in
>> > > RECLAIMING_IN_PROGRESS to be reaped by reclaimers.
>> > > - Due to above changes, also removed the no-longer needed  
>> encl->lock in
>> > > the OOM path which was causing deadlocks reported by the lock  
>> prover.
>> > >
>> >
>> > [...]
>> >
>> > > +
>> > > +/**
>> > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU
>> > > + * @lru:	LRU that is low
>> > > + *
>> > > + * Return:	%true if a victim was found and kicked.
>> > > + */
>> > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru)
>> > > +{
>> > > +	struct sgx_epc_page *victim;
>> > > +
>> > > +	spin_lock(&lru->lock);
>> > > +	victim = sgx_oom_get_victim(lru);
>> > > +	spin_unlock(&lru->lock);
>> > > +
>> > > +	if (!victim)
>> > > +		return false;
>> > > +
>> > > +	if (victim->flags & SGX_EPC_OWNER_PAGE)
>> > > +		return sgx_oom_encl_page(victim->encl_page);
>> > > +
>> > > +	if (victim->flags & SGX_EPC_OWNER_ENCL)
>> > > +		return sgx_oom_encl(victim->encl);
>> >
>> > I hate to bring this up, at least at this stage, but I am wondering  
>> why
>> > we need
>> > to put VA and SECS pages to the unreclaimable list, but cannot keep an
>> > "enclave_list" instead?
>> >
>> > So by looking the patch (" x86/sgx: Limit process EPC usage with misc
>> > cgroup
>> > controller"), if I am not missing anything, the whole "unreclaimable"
>> > list is
>> > just used to find the victim enclave when OOM needs to be done.   
>> Thus, I
>> > don't
>> > see why "enclave_list" cannot be used to achieve this.
>> >
>> > The reason that I am asking is because it seems using "enclave_list"  
>> we
>> > can
>> > simplify the code.  At least the patches related to track VA/SECS  
>> pages,
>> > and the
>> > SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated
>> > completely.
>> > Using "enclave_list", I guess you just need to put the enclave to the
>> > current
>> > EPC cgroup when SECS page is allocated.
>> >
>> Later the hosting process could migrated/reassigned to another cgroup?
>> What to do when the new cgroup is OOM?
>>
>
> You addressed in the documentation, no?
>
> +Migration
> +---------
> +
> +Once an EPC page is charged to a cgroup (during allocation), it
> +remains charged to the original cgroup until the page is released
> +or reclaimed.  Migrating a process to a different cgroup doesn't
> +move the EPC charges that it incurred while in the previous cgroup
> +to its new cgroup.

Should we kill the enclave though because some VA pages may be in the new  
group?

Haitao

Powered by blists - more mailing lists