Date:   Mon, 29 Aug 2016 21:56:07 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Cc:     Aaron Lu <aaron.lu@...el.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        "'Kirill A. Shutemov'" <kirill.shutemov@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Huang Ying <ying.huang@...el.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Jerome Marchand <jmarchan@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Ebru Akagunduz <ebru.akagunduz@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thp: reduce usage of huge zero page's atomic counter

On Tue, 30 Aug 2016 10:14:25 +0530 Anshuman Khandual <khandual@...ux.vnet.ibm.com> wrote:

> On 08/30/2016 09:09 AM, Andrew Morton wrote:
> > On Tue, 30 Aug 2016 11:09:15 +0800 Aaron Lu <aaron.lu@...el.com> wrote:
> > 
> >>>> Case used for test on Haswell EP:
> >>>> usemem -n 72 --readonly -j 0x200000 100G
> >>>> This spawns 72 processes; each mmaps 100G of anonymous space and
> >>>> then reads it sequentially, read-only, with a 2MB step (sketched below).
> >>>>
> >>>> perf report for base commit:
> >>>>     54.03%  usemem   [kernel.kallsyms]   [k] get_huge_zero_page
> >>>> perf report for this commit:
> >>>>      0.11%  usemem   [kernel.kallsyms]   [k] mm_get_huge_zero_page
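
For anyone without usemem at hand, the quoted test case boils down to
roughly the following (a minimal sketch assuming -j 0x200000 is the touch
stride; an illustration of the access pattern, not usemem itself):

/* Approximates "usemem -n 72 --readonly -j 0x200000 100G": 72 forked
 * readers, each touching one byte every 2MB of a fresh anonymous
 * mapping, so every touch read-faults the huge zero page in. */
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define STEP  (2UL << 20)     /* 2MB: one THP-sized stride */
#define SIZE  (100UL << 30)   /* 100G of anonymous memory  */
#define NPROC 72

int main(void)
{
	for (int i = 0; i < NPROC; i++) {
		if (fork() == 0) {
			volatile char *p = mmap(NULL, SIZE, PROT_READ,
					MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			unsigned long sum = 0;

			if (p == MAP_FAILED)
				exit(1);
			/* Read-only touch every 2MB: each fault maps the
			 * huge zero page and, before the patch, bumps its
			 * global atomic refcount. */
			for (unsigned long off = 0; off < SIZE; off += STEP)
				sum += p[off];
			exit(sum & 0xff);  /* keep the reads from being elided */
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}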
> >>>
> >>> Does this mean that overall usemem runtime halved?
> >>
> >> Sorry for the confusion, the above line is extracted from perf report.
> >> It shows the percentage of CPU cycles spent in a specific function.
> >>
> >> The two perf lines above are meant to show that get_huge_zero_page
> >> no longer consumes many CPU cycles once the patch is applied.
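
For reference, per-symbol percentages like these come from a cycles
profile; the thread doesn't record the exact invocation, but something
along these lines would produce them (flags are illustrative assumptions):

    # sample CPU cycles system-wide while the workload runs,
    # then print the per-symbol breakdown
    perf record -a -- usemem -n 72 --readonly -j 0x200000 100G
    perf report --stdio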
> >>
> >>>
> >>> Do we have any numbers for something more real-world?
> >>
> >> Unfortunately, no real-world numbers.
> >>
> >> We think the global atomic counter could be a performance issue,
> >> so I'm trying to solve that problem.
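
To make the concern concrete: every read fault on not-yet-written
anonymous THP memory takes a reference on a single global counter, so 72
processes faulting in parallel all hammer one cacheline. A schematic
userspace rendering of before versus after (my simplification, not the
code in mm/huge_memory.c; the kernel uses an atomic test-and-set on
mm->flags where the plain bool below stands in):

#include <stdatomic.h>
#include <stdbool.h>

static atomic_long huge_zero_refcount;  /* one global, shared by all CPUs */

struct mm { bool has_hzp_ref; };        /* stand-in for mm_struct */

/* Before: every huge-zero-page fault bumps the global counter,
 * bouncing its cacheline across CPUs under load. */
static void fault_before(struct mm *mm)
{
	(void)mm;
	atomic_fetch_add(&huge_zero_refcount, 1);
}

/* After: the reference is taken once per mm and dropped when the mm
 * exits, so repeat faults in the same process never touch the shared
 * counter. */
static void fault_after(struct mm *mm)
{
	if (!mm->has_hzp_ref) {
		atomic_fetch_add(&huge_zero_refcount, 1);
		mm->has_hzp_ref = true;
	}
}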
> > 
> > So, umm, we don't actually know if the patch is useful to anyone?
> 
> On a POWER system it reduces the CPU consumption of the above-mentioned
> function a little. I don't think it's going to improve the actual
> throughput of the workload substantially.
> 
> 0.07%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page
> 
> to
> 
> 0.01%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page

I can't say I'm surprised really.  A huge page is, ahem, huge.  The
computational cost of actually writing stuff into that page will swamp
the cost of the locking to acquire it.
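
A rough back-of-envelope supports that (order-of-magnitude figures I'm
assuming, not measurements from this thread):

    populating one 2MB page at ~10 GB/s : ~200 microseconds
    one contended atomic increment      : ~100 nanoseconds

For a workload that actually writes what it maps, the counter is well
under a tenth of a percent of the work. The benchmark above reads one
byte per 2MB and writes nothing, which is exactly why the counter could
dominate its profile.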

Is the patch really worth the additional complexity?
