linux-kernel - Re: RFC: using worker threadpool to speed up clear_huge

Message-Id: <20160727.221331.1781806974054327287.davem@davemloft.net>
Date:	Wed, 27 Jul 2016 22:13:31 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	kishore.kumar.pusukuri@...cle.com
Cc:	linux-kernel@...r.kernel.org, sparclinux@...r.kernel.org
Subject: Re: RFC: using worker threadpool to speed up clear_huge_page() by
 up to 5x

From: kpusukur <kishore.kumar.pusukuri@...cle.com>
Date: Sun, 17 Jul 2016 12:35:20 -0700

> We would welcome feedback and discussion of potential problems.
> 
> We would also like to hear ideas for other areas in the kernel where a
> similar technique could be employed. For example, we've also applied
> this idea to copy on write operations for huge pages and it achieves
> around 20x speedup.

I don't know about this.

You can only profitably do this when you have enough physical cpu
resources schedulable, and on the same NUMA node.

By the time you compute the complete answer to that entire condition
you could have completed the hugepage clear.

Also, you should experiment with simply using a dedicated hugepage
clear assembler loop for these chips.  It's really stupid to pay the
transaction cost of going in and out of the clear_user_highpage()
function N times per huge page.
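
To make the last point concrete, here is a minimal user-space sketch in C
of the two shapes being contrasted: one call per base page versus a single
pass over the whole huge page.  It is not kernel code and not the
assembler loop suggested above.  The names sketch_clear_one_page(),
sketch_clear_huge_per_page() and sketch_clear_huge_single_pass() are
hypothetical, memset() stands in for the real clearing primitive, and the
4 KiB / 512-page sizes are assumed for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Assumed sizes for illustration; real values depend on the
 * architecture and kernel configuration. */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_PAGES_PER_HUGE 512u

/* Hypothetical stand-in for a per-base-page clear routine. */
void sketch_clear_one_page(unsigned char *page)
{
    memset(page, 0, SKETCH_PAGE_SIZE);
}

/* Per-page shape: the call overhead is paid once per base page,
 * i.e. SKETCH_PAGES_PER_HUGE times per huge page. */
void sketch_clear_huge_per_page(unsigned char *huge)
{
    for (size_t i = 0; i < SKETCH_PAGES_PER_HUGE; i++)
        sketch_clear_one_page(huge + i * SKETCH_PAGE_SIZE);
}

/* Dedicated shape: one pass over the whole huge page, so any
 * per-call setup and teardown is paid only once. */
void sketch_clear_huge_single_pass(unsigned char *huge)
{
    memset(huge, 0, (size_t)SKETCH_PAGE_SIZE * SKETCH_PAGES_PER_HUGE);
}
```

Both functions leave the buffer in the same state; the only difference is
how many times the entry and exit cost is paid.  Whether that difference
is measurable on a given chip would have to be benchmarked.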