Open Source and information security mailing list archives
Message-ID: <20170505131649.t5ffmg7xspndtrc4@node.shutemov.name>
Date:   Fri, 5 May 2017 16:16:49 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
Cc:     linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        linux-mm@...ck.org, kirill.shutemov@...ux.intel.com
Subject: Re: [PATCH] mm, sparsemem: break out of loops early

On Thu, May 04, 2017 at 10:44:34AM -0700, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@...ux.intel.com>
> 
> There are a number of times that we loop over NR_MEM_SECTIONS,
> looking for section_present() on each section.  But, when we have
> very large physical address spaces (large MAX_PHYSMEM_BITS),
> NR_MEM_SECTIONS becomes very large, making the loops quite long.
> 
> With MAX_PHYSMEM_BITS=46 and a section size of 128MB, the current
> loops are 512k iterations, which we barely notice on modern
> hardware.  But, raising MAX_PHYSMEM_BITS higher (like we will see
> on systems that support 5-level paging) makes this 64x longer and
> we start to notice, especially on slower systems like simulators.
> A 10-second delay for 512k iterations is annoying.  But, a 640-
> second delay is crippling.
> 
> This does not help if we have extremely sparse physical address
> spaces, but those are quite rare.  We expect that most of the
> "slow" systems where this matters will also be quite small and
> non-sparse.
> 
> To fix this, we track the highest section we've ever encountered.
> This lets us know when we will *never* see another
> section_present(), and lets us break out of the loops earlier.
> 
> Doing the whole for_each_present_section_nr() macro is probably
> overkill, but it will ensure that any future loop iterations that
> we grow are more likely to be correct.
> 
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

Tested-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

It shaved almost 40 seconds from boot time in qemu with 5-level paging
enabled for me :)

-- 
 Kirill A. Shutemov
