Message-ID: <431c5d7f5aee7d11ec2e8aa2e526fde438fa53b4.camel@intel.com>
Date: Mon, 27 Nov 2023 09:57:03 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "mingo@...hat.com" <mingo@...hat.com>,
"jarkko@...nel.org" <jarkko@...nel.org>,
"x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
"hpa@...or.com" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"mkoutny@...e.com" <mkoutny@...e.com>,
"haitao.huang@...ux.intel.com" <haitao.huang@...ux.intel.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"tj@...nel.org" <tj@...nel.org>,
"Mehta, Sohil" <sohil.mehta@...el.com>,
"bp@...en8.de" <bp@...en8.de>
CC: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>,
"anakrish@...rosoft.com" <anakrish@...rosoft.com>,
"sean.j.christopherson@...el.com" <sean.j.christopherson@...el.com>,
"Li, Zhiquan1" <zhiquan1.li@...el.com>,
"yangjie@...rosoft.com" <yangjie@...rosoft.com>
Subject: Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim
function
On Mon, 2023-11-27 at 00:27 +0800, Haitao Huang wrote:
> On Mon, 20 Nov 2023 11:45:46 +0800, Huang, Kai <kai.huang@...el.com> wrote:
>
> > On Mon, 2023-10-30 at 11:20 -0700, Haitao Huang wrote:
> > > From: Sean Christopherson <sean.j.christopherson@...el.com>
> > >
> > > To prepare for per-cgroup reclamation, separate the top-level reclaim
> > > function, sgx_reclaim_epc_pages(), into two separate functions:
> > >
> > > - sgx_isolate_epc_pages() scans and isolates reclaimable pages from a
> > > given LRU list.
> > > - sgx_do_epc_reclamation() performs the real reclamation for the
> > > already isolated pages.
> > >
> > > Create a new function, sgx_reclaim_epc_pages_global(), calling those two
> > > in succession, to replace the original sgx_reclaim_epc_pages(). The
> > > above two functions will serve as building blocks for the reclamation
> > > flows in later EPC cgroup implementation.
> > >
> > > sgx_do_epc_reclamation() returns the number of reclaimed pages. The EPC
> > > cgroup will use the result to track reclaiming progress.
> > >
> > > sgx_isolate_epc_pages() returns the additional number of pages to scan
> > > for the current epoch of reclamation. The EPC cgroup will use the result
> > > to determine whether more scanning needs to be done in the LRUs of its
> > > child groups.
> >
> > This changelog says nothing about "why", but only mentions the
> > "implementation".
> >
> > For instance, assuming we need to reclaim @npages_to_reclaim from the
> > @epc_cgrp_to_reclaim and its descendants, why can't we do:
> >
> > 	for_each_cgroup_and_descendants(&epc_cgrp_to_reclaim, &epc_cgrp) {
> > 		if (npages_to_reclaim <= 0)
> > 			return;
> >
> > 		npages_to_reclaim -= sgx_reclaim_pages_lru(&epc_cgrp->lru,
> > 							   npages_to_reclaim);
> > 	}
> >
> > Is there any difference in having "isolate" + "reclaim"?
> >
>
> This is to optimize "reclaim". See how etrack was done in sgx_encl_ewb.
> Here we just follow the same design as ksgxd for each reclamation cycle.
I don't see how you "followed" ksgxd. If I am guessing correctly, you are
worried that there might be fewer than 16 pages in a given EPC cgroup, thus
without splitting into "isolate" + "reclaim" you might feed "reclaim" fewer
than 16 pages, which might cause some performance degradation?
But is this a common case? Should we even worry about it?
I suppose for such a new feature we should bring in the functionality first,
and then do the optimization later if you have real performance data to show.
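To make sure we are on the same page, below is a sketch of my understanding of
what the "isolate" + "reclaim" split is supposed to buy. The signatures and
the for_each_cgroup_and_descendants() helper are hypothetical, only for
illustration, not taken from your patch:

	/*
	 * Hypothetical two-phase flow: isolate pages from a cgroup and all
	 * of its descendants into one local list first, then reclaim the
	 * whole batch with a single sgx_do_epc_reclamation() call, so the
	 * EWB/ETRACK/IPI batching still sees a full batch even when each
	 * individual LRU holds fewer than 16 reclaimable pages.
	 */
	LIST_HEAD(iso);
	size_t to_scan = SGX_NR_TO_SCAN;

	for_each_cgroup_and_descendants(&epc_cgrp_to_reclaim, &epc_cgrp) {
		if (!to_scan)
			break;
		/* assumed to return how many pages are still left to scan */
		to_scan = sgx_isolate_epc_pages(&epc_cgrp->lru, to_scan, &iso);
	}

	sgx_do_epc_reclamation(&iso);

If that is the whole point, then to me it is still an optimization for the
"each LRU is nearly empty" case, and it should come with numbers.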
>
> > >
> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
> > > Co-developed-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
> > > Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
> > > Co-developed-by: Haitao Huang <haitao.huang@...ux.intel.com>
> > > Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
> > > Cc: Sean Christopherson <seanjc@...gle.com>
> > > ---
> > >
> >
> > [...]
> >
> > > +/**
> > > + * sgx_do_epc_reclamation() - Perform reclamation for isolated EPC pages.
> > > + * @iso: List of isolated pages for reclamation
> > > + *
> > > + * Take a list of EPC pages and reclaim them to the enclave's private shmem files. Do not
> > > + * reclaim the pages that have been accessed since the last scan, and move each of those pages
> > > + * to the tail of its tracking LRU list.
> > > + *
> > > + * Limit the number of pages to be processed up to SGX_NR_TO_SCAN_MAX per call in order to
> > > + * degrade amount of IPI's and ETRACK's potentially required. sgx_encl_ewb() does degrade a bit
> > > + * among the HW threads with three stage EWB pipeline (EWB, ETRACK + EWB and IPI + EWB) but not
> > > + * sufficiently. Reclaiming one page at a time would also be problematic as it would increase
> > > + * the lock contention too much, which would halt forward progress.
> >
> > This is kind of an optimization, correct? Is there any real performance
> > data to justify this?
>
> The above sentences were there originally. This optimization was justified.
I am talking about 16 -> 32.
>
> > If this optimization is useful, shouldn't we bring this optimization to
> > the current sgx_reclaim_pages() instead, e.g., just increase
> > SGX_NR_TO_SCAN (16) to SGX_NR_TO_SCAN_MAX (32)?
> >
>
> SGX_NR_TO_SCAN_MAX might have been designed earlier for other reasons I
> don't know. Currently it is really the buffer size allocated in
> sgx_reclaim_pages(). Both the cgroup and ksgxd scan 16 pages at a time.
> Maybe we should just use SGX_NR_TO_SCAN; no _MAX needed. The point was to
> batch reclamation to a certain number to minimize the impact of the EWB
> pipeline. 16 was the original design.
>
Please don't leave it to the reviewers to figure out why you are trying to do
this. If you don't know, then just drop it.
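And if the 16 -> 32 batching itself turns out to be justified by data, my
first instinct is that it belongs in the existing code, e.g. something as
small as the sketch below (untested, just to show what I mean) rather than a
new SGX_NR_TO_SCAN_MAX:

	/* untested sketch: bump the existing batch size in place */
	-#define SGX_NR_TO_SCAN		16
	+#define SGX_NR_TO_SCAN		32

That way ksgxd and the cgroup reclaim path keep sharing one constant.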