lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170512.133742.2144484253675877904.davem@davemloft.net>
Date:   Fri, 12 May 2017 13:37:42 -0400 (EDT)
From:   David Miller <davem@...emloft.net>
To:     pasha.tatashin@...cle.com
Cc:     mhocko@...nel.org, linux-kernel@...r.kernel.org,
        sparclinux@...r.kernel.org, linux-mm@...ck.org,
        linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
        borntraeger@...ibm.com, heiko.carstens@...ibm.com
Subject: Re: [v3 0/9] parallelized "struct page" zeroing

From: Pasha Tatashin <pasha.tatashin@...cle.com>
Date: Fri, 12 May 2017 13:24:52 -0400

> Right now it is larger, but what I suggested is to add a new optimized
> routine just for this case, which would do STBI for 64-bytes but
> without membar (do membar at the end of memmap_init_zone() and
> deferred_init_memmap()
> 
> #define struct_page_clear(page)                                 \
>         __asm__ __volatile__(                                   \
>         "stxa   %%g0, [%0]%2\n"                                 \
>         "stxa   %%xg0, [%0 + %1]%2\n"                           \
>         : /* No output */                                       \
>         : "r" (page), "r" (0x20), "i"(ASI_BLK_INIT_QUAD_LDD_P))
> 
> And insert it into __init_single_page() instead of memset()
> 
> The final result is 4.01s/T which is even faster compared to current
> 4.97s/T

Ok, indeed, that would work.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ