Message-ID: <26f5ff14-077f-4bb6-90d8-ea83509ff682@redhat.com>
Date:   Wed, 7 Sep 2022 10:44:44 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Rebecca Mckeever <remckee0@...il.com>
Cc:     Mike Rapoport <rppt@...nel.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/4] memblock tests: add simulation of physical memory
 with multiple NUMA nodes

On 07.09.22 01:43, Rebecca Mckeever wrote:
> On Tue, Sep 06, 2022 at 03:17:46PM +0200, David Hildenbrand wrote:
>> On 04.09.22 06:21, Rebecca Mckeever wrote:
>>> Add function setup_numa_memblock() for setting up a memory layout with
>>> multiple NUMA nodes in a previously allocated dummy physical memory.
>>> This function can be used in place of setup_memblock() in tests that need
>>> to simulate a NUMA system.
>>>
>>> setup_numa_memblock():
>>> - allows for setting up a memory layout by specifying the fraction of
>>>     MEM_SIZE in each node
>>>
>>> Set CONFIG_NODES_SHIFT to 4 when building with NUMA=1 to allow for up to
>>> 16 NUMA nodes.
>>>
>>> Signed-off-by: Rebecca Mckeever <remckee0@...il.com>
>>> ---
>>>    .../testing/memblock/scripts/Makefile.include |  2 +-
>>>    tools/testing/memblock/tests/common.c         | 29 +++++++++++++++++++
>>>    tools/testing/memblock/tests/common.h         |  4 ++-
>>>    3 files changed, 33 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/testing/memblock/scripts/Makefile.include b/tools/testing/memblock/scripts/Makefile.include
>>> index aa6d82d56a23..998281723590 100644
>>> --- a/tools/testing/memblock/scripts/Makefile.include
>>> +++ b/tools/testing/memblock/scripts/Makefile.include
>>> @@ -3,7 +3,7 @@
>>>    # Simulate CONFIG_NUMA=y
>>>    ifeq ($(NUMA), 1)
>>> -	CFLAGS += -D CONFIG_NUMA
>>> +	CFLAGS += -D CONFIG_NUMA -D CONFIG_NODES_SHIFT=4
>>>    endif
>>>    # Use 32 bit physical addresses.
>>> diff --git a/tools/testing/memblock/tests/common.c b/tools/testing/memblock/tests/common.c
>>> index eec6901081af..b6110df21b2a 100644
>>> --- a/tools/testing/memblock/tests/common.c
>>> +++ b/tools/testing/memblock/tests/common.c
>>> @@ -72,6 +72,35 @@ void setup_memblock(void)
>>>    	fill_memblock();
>>>    }
>>> +/**
>>> + * setup_numa_memblock:
>>> + * Set up a memory layout with multiple NUMA nodes in a previously allocated
>>> + * dummy physical memory.
>>> + * @nodes: an array containing the denominators of the fractions of MEM_SIZE
>>> + *         contained in each node (e.g., if nodes[0] = SZ_8, node 0 will
>>> + *         contain 1/8th of MEM_SIZE)
>>> + *
>>> + * The nids will be set to 0 through NUMA_NODES - 1.
>>> + */
>>> +void setup_numa_memblock(const phys_addr_t nodes[])
>>> +{
>>> +	phys_addr_t base;
>>> +	int flags;
>>> +
>>> +	reset_memblock_regions();
>>> +	base = (phys_addr_t)memory_block.base;
>>> +	flags = (movable_node_is_enabled()) ? MEMBLOCK_NONE : MEMBLOCK_HOTPLUG;
>>> +
>>> +	for (int i = 0; i < NUMA_NODES; i++) {
>>> +		assert(nodes[i] <= MEM_SIZE && nodes[i] > 0);
>>
>> I think it would be even easier to get if this would just be a fraction.
>> E.g., instead of "1/8 * MEM_SIZE" just "1/8". All values have to add up to
>> 1.
>>
>> ... but then we'd have to mess with floats eventually, so I guess this makes
>> it easier to handle these fractions.
>>
>>
>> We could use "int" and simply specify the fraction in percent, like
>>
>> nodes[0] = 50;
>> nodes[1] = 25;
>> nodes[2] = 25;
>>
>> and everything has to add up to 100.
>>
> This would still be a float for 1/8th (12.5) and 1/16th (6.25). What if
> it was the "percent" of 256 (i.e., 0x100)?

Right, or in something "smaller" like 1/32nd. I don't think we go below
that?

If we don't need more digits, why not in "basis points" (per ten thousand)
-> https://en.wikipedia.org/wiki/Basis_point

nodes[0] = 5000; /* 1/2 */
nodes[1] = 2500; /* 1/4 */
nodes[2] = 1250; /* 1/8 */
nodes[3] = 625;  /* 1/16 */
nodes[4] = 625;  /* 1/16 */
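
As a rough sketch of how basis points could map to node sizes (this is not
code from the patch; bp_to_size(), bp_total() and BASIS_POINTS are
hypothetical names, and the memory size is passed in rather than taken from
the test suite's MEM_SIZE):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: one whole expressed in basis points. */
#define BASIS_POINTS 10000u

/*
 * Convert a share given in basis points into a size in bytes.
 * Multiplies before dividing so that shares such as 625 (1/16) stay
 * exact when mem_size is a power of two that is at least 16.
 * Assumes mem_size * bp does not overflow 64 bits.
 */
static uint64_t bp_to_size(uint64_t mem_size, unsigned int bp)
{
	assert(bp > 0 && bp <= BASIS_POINTS);
	return mem_size * bp / BASIS_POINTS;
}

/* Sum the shares; a complete layout should add up to BASIS_POINTS. */
static unsigned int bp_total(const unsigned int *nodes, int count)
{
	unsigned int total = 0;

	for (int i = 0; i < count; i++)
		total += nodes[i];
	return total;
}
```

With shares of 5000, 2500, 1250, 625 and 625 the total is 10000, and for a
16 KiB region the sizes come out as 8192, 4096, 2048, 1024 and 1024 bytes.
The values need to be written as plain decimal literals, since a leading
zero makes a C integer literal octal.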


>>
>>> +		phys_addr_t size = MEM_SIZE / nodes[i];
>>
>>
>> Hmmm, assuming a single node with "MEM_SIZE", we would get size=1.
>>
> For a single node of MEM_SIZE, nodes[0] would be 1.
> 
>> Shouldn't this be "size = nodes[i]"
>>
>> ?
> No, not with the current implementation. The nodes array stores the
> denominator of the fraction that will be multiplied by MEM_SIZE to
> determine the size of that node (the numerator is always 1). So if the
> size of the node should be 1/8 * MEM_SIZE, the nodes array just stores
> the 8. I think the name of the array is misleading. Do you have any
> suggestions for a better name?

Then I am confused about the
	assert(nodes[i] <= MEM_SIZE && nodes[i] > 0);

assertion :)

I think it would really be best to just store the actual fraction
somehow. But maybe I am just confused :)
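
For reference, one way to read the assertion under the denominator scheme,
sketched in C (denom_to_size() and denoms_cover_memory() are hypothetical
helpers, not part of the patch, and the memory size is a parameter here):
with integer division, mem_size / denom is at least 1 exactly when denom is
between 1 and mem_size, so the check only guarantees a non-empty node. It
does not check that the nodes together cover the whole region, which the
second helper does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size of a node whose share is 1/denom of mem_size.
 * The assertion mirrors the one in the patch: it rules out a zero
 * denominator and any denominator that would round the size down to 0.
 */
static uint64_t denom_to_size(uint64_t mem_size, uint64_t denom)
{
	assert(denom > 0 && denom <= mem_size);
	return mem_size / denom;
}

/*
 * Return 1 if every 1/denoms[i] share divides mem_size evenly and the
 * shares add up to exactly mem_size, 0 otherwise.
 */
static int denoms_cover_memory(uint64_t mem_size, const uint64_t *denoms,
			       int count)
{
	uint64_t total = 0;

	for (int i = 0; i < count; i++) {
		if (denoms[i] == 0 || mem_size % denoms[i] != 0)
			return 0;
		total += mem_size / denoms[i];
	}
	return total == mem_size;
}
```

A single node with denominator 1 gets all of mem_size, and a denominator
equal to mem_size gives a 1-byte node, which is the boundary case the
assertion still allows.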

-- 
Thanks,

David / dhildenb
