Message-ID: <44dff493-9d79-4343-ba81-0c262d7a5b4e@redhat.com>
Date: Fri, 11 Apr 2025 15:04:16 +1000
From: Gavin Shan <gshan@...hat.com>
To: David Hildenbrand <david@...hat.com>, Oscar Salvador <osalvador@...e.de>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, adityag@...ux.ibm.com,
 donettom@...ux.ibm.com, gregkh@...uxfoundation.org, rafael@...nel.org,
 dakr@...nel.org, akpm@...ux-foundation.org, shan.gavin@...il.com
Subject: Re: [PATCH] drivers/base/memory: Avoid overhead from
 for_each_present_section_nr()

On 4/11/25 12:25 AM, David Hildenbrand wrote:
> On 10.04.25 16:12, Oscar Salvador wrote:
>> On Thu, Apr 10, 2025 at 03:55:19PM +0200, Oscar Salvador wrote:
>>> All in all, I think we are better off, and the code is slightly simpler?
>>
>> One thing to notice is that maybe we could further improve and leap 'nr'
>> by the number of sections_per_block, so in those scenarios where
>> a memory-block spans multiple sections this could be faster?
> 
> Essentially, when we create a block, we could always continue with the next section that starts after the block.
> 

I think it's a good point. I tried a quick test on an ARM64 machine with 1TB
of memory. Leaping 'nr' by 'sections_per_block' improves the performance a bit,
though not by much: the time taken by memory_dev_init() drops from 110ms to 100ms.
On the IBM Power9 machine (64GB of memory) I have, there isn't much room for
improvement because memory_dev_init() takes only 10ms there. I will post a patch
for review after this patch gets merged, if you agree.

         for_each_present_section_nr(0, nr) {
-               if (block_id != ULONG_MAX && memory_block_id(nr) == block_id)
-                       continue;
-
-               block_id = memory_block_id(nr);
-               ret = add_memory_block(block_id, MEM_ONLINE, NULL, NULL);
+               ret = add_memory_block(memory_block_id(nr), MEM_ONLINE, NULL, NULL);
                 if (ret) {
                         panic("%s() failed to add memory block: %d\n",
                               __func__, ret);
                 }
+
+               /* Align to next block, minus one section */
+               nr = ALIGN(nr + 1, sections_per_block) - 1;
         }
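
For illustration only, below is a standalone userspace sketch of the skip
arithmetic. The ALIGN() macro here mirrors the kernel's power-of-two rounding
helper from include/linux/align.h; sections_per_block = 8 and the plain nr++
loop are assumptions standing in for for_each_present_section_nr(), which
additionally skips non-present sections:

    #include <stdio.h>

    /* Same rounding the kernel's ALIGN() does; 'a' must be a power of two. */
    #define ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))

    int main(void)
    {
            unsigned long sections_per_block = 8;   /* assumed example value */
            unsigned long nr;

            for (nr = 0; nr < 32; nr++) {
                    printf("add block %lu (reached via section %lu)\n",
                           nr / sections_per_block, nr);
                    /*
                     * Align to the next block, minus one section, so the
                     * loop's increment lands on the first section of the
                     * next block and each block is added exactly once.
                     */
                    nr = ALIGN(nr + 1, sections_per_block) - 1;
            }
            return 0;
    }

With 32 sections and 8 sections per block, this visits only sections 0, 8, 16
and 24, i.e. one add per block instead of one per section.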

Thanks,
Gavin
