Message-ID: <c710eea8-a0ea-70b6-6521-0dd685adbb06@intel.com>
Date: Wed, 10 Mar 2021 07:44:39 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Jarkko Sakkinen <jarkko@...nel.org>
Cc: linux-sgx@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 5/5] x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()
>>> + * node.
>>> + */
>>> +static struct sgx_numa_node *sgx_numa_nodes;
>>> +
>>> +/*
>>> + * sgx_free_epc_page() uses this to find out the correct struct sgx_numa_node,
>>> + * to put the page in.
>>> + */
>>> +static int sgx_section_to_numa_node_id[SGX_MAX_EPC_SECTIONS];
>>
>> If this is per-section, why not put it in struct sgx_epc_section?
>
> Because struct sgx_epc_page does not contain a pointer to
> struct sgx_epc_section.
Currently, we have epc_page->section. That's not changing. But, with
the free list moving from sgx_epc_section to sgx_numa_node, we need a
way to get from page->node, not just page->section.
We can either add that to:
struct sgx_epc_section {
...
+ struct sgx_numa_node *node;
}
so we can do epc_page->section->node to find the epc_page's free list,
or we could just do:
struct sgx_epc_page {
- unsigned int section;
+ unsigned int node;
unsigned int flags;
struct sgx_encl_page *owner;
struct list_head list;
struct list_head numa_list;
};
and go from page->node directly.
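To make it concrete, the free path with page->node could end up looking
something like this (just a sketch: the per-node lock, free list and
counter are things this series would need to add to struct
sgx_numa_node, and the names here are illustrative, not from the patch):

static void sgx_free_epc_page(struct sgx_epc_page *page)
{
	/* Look up the node's free list directly from the page: */
	struct sgx_numa_node *node = &sgx_numa_nodes[page->node];

	spin_lock(&node->lock);
	list_add_tail(&page->list, &node->free_page_list);
	node->nr_free_pages++;
	spin_unlock(&node->lock);
}

One lock and one list per node, no global state to maintain.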
>>> page = list_first_entry(&sgx_free_page_list, struct sgx_epc_page, list);
>>> + list_del_init(&page->numa_list);
>>> list_del_init(&page->list);
>>> sgx_nr_free_pages--;
>>
>> I would much rather prefer that this does what the real page allocator
>> does: keep the page on a single list. That list is maintained
>> per-NUMA-node. Allocations try local NUMA node structures, then fall
>> back to other structures (hopefully in a locality-aware fashion).
>>
>> I wrote you the loop that I want to see this implement in an earlier
>> review. This, basically:
>>
>> page = NULL;
>> nid = numa_node_id();
>> while (true) {
>> page = __sgx_alloc_epc_page_from_node(nid);
>> if (page)
>> break;
>>
>> nid = // ... some search here, next_node_in()...
>> // check if we wrapped around:
>> if (nid == numa_node_id())
>> break;
>> }
>>
>> There's no global list. You just walk around nodes trying to find one
>> with space. If you wrap around, you stop.
>>
>> Please implement this. If you think it's a bad idea, or can't, let's
>> talk about it in advance. Right now, it appears that my review comments
>> aren't being incorporated into newer versions.
>
> How I read your earlier comments is that the fallback is unfair, and
> this patch set version does fix that.
>
> I can buy the above allocation scheme, but I don't think this patch set
> version is a step backwards. The things done to struct sgx_epc_section
> are exactly what should be done to it.
To me, it's a step backwards. It regresses in that it falls back to an
entirely non-NUMA-aware allocation mechanism. The global list is
actually likely to be even worse than the per-section searches because
it has a global lock, as opposed to the per-section locks. It also has
the overhead of managing two lists instead of one.
So, yes, it is *fair* in terms of NUMA node pressure. But being fair in
a NUMA-aware allocator by simply not doing NUMA at all is a regression.
> Implementation-wise, you are asking me to squash 4/5 and 5/5 into a
> single patch and remove the global list. That's a small iteration from
> this patch version, and I can do it.
Sounds good.
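For reference, the whole allocation fallback could be about this small
(again just a sketch, untested; __sgx_alloc_epc_page_from_node() is the
per-node helper the series would add, and I cache the starting node so
the wrap-around check stays stable even if we migrate mid-loop):

static struct sgx_epc_page *__sgx_alloc_epc_page(void)
{
	struct sgx_epc_page *page;
	int start_nid = numa_node_id();
	int nid = start_nid;

	while (true) {
		/* Try the current node, starting with the local one: */
		page = __sgx_alloc_epc_page_from_node(nid);
		if (page)
			return page;

		/* next_node_in() wraps around the online node mask: */
		nid = next_node_in(nid, node_online_map);
		if (nid == start_nid)
			break;
	}

	return NULL;
}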