Message-ID: <07a6772b-711d-4fdc-f688-db76f1ec4c45@oracle.com>
Date:   Fri, 26 May 2017 12:45:55 -0400
From:   Pasha Tatashin <pasha.tatashin@...cle.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-kernel@...r.kernel.org, sparclinux@...r.kernel.org,
        linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
        linux-s390@...r.kernel.org, borntraeger@...ibm.com,
        heiko.carstens@...ibm.com, davem@...emloft.net
Subject: Re: [v3 0/9] parallelized "struct page" zeroing

Hi Michal,

I have considered your proposals:

1. Making memset(0) unconditional inside __init_single_page() is not
going to work because it slows down SPARC and ppc64. On SPARC, even the
BSTI optimization that I proposed earlier won't work: after consulting
with other engineers, I was told that stores (without loads!) after
BSTI without a membar are unsafe.

2. Adding ARCH_WANT_LARGE_PAGEBLOCK_INIT is not going to solve the
problem, because while an arch might want a large memset(), it still
wants to get the benefit of parallelized struct page initialization.

3. Another approach that I have considered is moving memset() above
__init_single_page() and doing it in larger chunks. However, this
solution is also not going to work, because inside the loops there are
cases where "struct page"s are skipped, so every single page is checked:
early_pfn_valid(pfn), early_pfn_in_nid(), and also the
mirrored_kernelcore cases.
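
To illustrate point 3, here is a minimal userspace sketch, not the
kernel code. All names in it (struct page_sketch, pfn_usable_sketch(),
init_single_page_sketch(), init_range_sketch()) are hypothetical
stand-ins, and the "every fourth pfn is a hole" rule is invented purely
for the example. It shows a per-pfn loop in which skipped entries are
never touched, which a single large memset() over the whole range would
not preserve:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "struct page"; not the real kernel layout. */
struct page_sketch {
    unsigned long flags;
    int refcount;
};

/*
 * Hypothetical check standing in for early_pfn_valid() and
 * early_pfn_in_nid(). Assumption for this sketch only: every fourth
 * pfn is treated as a hole.
 */
static bool pfn_usable_sketch(unsigned long pfn)
{
    return (pfn % 4) != 3;
}

/* Per-page init: zero this one entry, then set its fields. */
static void init_single_page_sketch(struct page_sketch *page)
{
    memset(page, 0, sizeof(*page));
    page->refcount = 1;
}

/*
 * Walk [start_pfn, end_pfn) and initialize only the usable pfns.
 * Skipped entries are left untouched. Returns the number of entries
 * that were initialized.
 */
static size_t init_range_sketch(struct page_sketch *map,
                                unsigned long start_pfn,
                                unsigned long end_pfn)
{
    size_t done = 0;

    for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (!pfn_usable_sketch(pfn))
            continue;   /* skipped: this entry is not written */
        init_single_page_sketch(&map[pfn - start_pfn]);
        done++;
    }
    return done;
}
```

Because the skip decision is made per pfn, the zeroing in this sketch
has to stay next to the per-page init rather than in front of the loop.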

> I wouldn't be so sure about this. If any other platform has a similar
> issues with small memset as sparc then the overhead is just papered over
> by parallel initialization.

That is true, and that is fine, because parallelization gives an order
of magnitude improvement, which outweighs the cost of slower
single-thread performance. Remember, this happens only during boot and
memory hotplug, and is not something that eats up computing resources
during runtime.

So, at the moment I cannot really find a better solution compared to 
what I have proposed: do memset() inside __init_single_page() only when 
deferred initialization is enabled.
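
As a rough userspace sketch of that idea (not the actual patch): the
macro DEFERRED_INIT_SKETCH, struct page_entry and init_page_entry() are
hypothetical names made up for illustration. The sketch also assumes
that when the macro is 0, the entry was already zeroed by whoever
allocated it.

```c
#include <string.h>

/*
 * Hypothetical compile-time switch standing in for "deferred
 * initialization is enabled". Defaults to 1 in this sketch.
 */
#ifndef DEFERRED_INIT_SKETCH
#define DEFERRED_INIT_SKETCH 1
#endif

/* Hypothetical stand-in for "struct page"; not the real kernel layout. */
struct page_entry {
    unsigned long flags;
    int refcount;
};

/*
 * Per-page init. The memset() is compiled in only when the switch is
 * on; otherwise the entry is assumed to have been zeroed earlier.
 */
static void init_page_entry(struct page_entry *page)
{
#if DEFERRED_INIT_SKETCH
    memset(page, 0, sizeof(*page));
#endif
    page->refcount = 1;
}
```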

Pasha
