Date:   Mon, 29 Aug 2016 21:56:07 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Cc:     Aaron Lu <aaron.lu@...el.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        "'Kirill A. Shutemov'" <kirill.shutemov@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Huang Ying <ying.huang@...el.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Jerome Marchand <jmarchan@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Ebru Akagunduz <ebru.akagunduz@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thp: reduce usage of huge zero page's atomic counter

On Tue, 30 Aug 2016 10:14:25 +0530 Anshuman Khandual <khandual@...ux.vnet.ibm.com> wrote:

> On 08/30/2016 09:09 AM, Andrew Morton wrote:
> > On Tue, 30 Aug 2016 11:09:15 +0800 Aaron Lu <aaron.lu@...el.com> wrote:
> > 
> >>>> Case used for test on Haswell EP:
> >>>> usemem -n 72 --readonly -j 0x200000 100G
> >>>> This spawns 72 processes; each mmaps 100G of anonymous space and
> >>>> then reads it sequentially, read-only, with a 2MB step (sketched below).
> >>>>
> >>>> perf report for base commit:
> >>>>     54.03%  usemem   [kernel.kallsyms]   [k] get_huge_zero_page
> >>>> perf report for this commit:
> >>>>      0.11%  usemem   [kernel.kallsyms]   [k] mm_get_huge_zero_page
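
For anyone without usemem at hand, the quoted test case boils down to
roughly the following (a minimal sketch assuming -j 0x200000 is the touch
stride; an illustration of the access pattern, not usemem itself):

/* Approximates "usemem -n 72 --readonly -j 0x200000 100G": 72 forked
 * readers, each touching one byte every 2MB of a fresh anonymous
 * mapping, so every touch read-faults the huge zero page in. */
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define STEP  (2UL << 20)     /* 2MB: one THP-sized stride */
#define SIZE  (100UL << 30)   /* 100G of anonymous memory  */
#define NPROC 72

int main(void)
{
	for (int i = 0; i < NPROC; i++) {
		if (fork() == 0) {
			volatile char *p = mmap(NULL, SIZE, PROT_READ,
					MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			unsigned long sum = 0;

			if (p == MAP_FAILED)
				exit(1);
			/* Read-only touch every 2MB: each fault maps the
			 * huge zero page and, before the patch, bumps its
			 * global atomic refcount. */
			for (unsigned long off = 0; off < SIZE; off += STEP)
				sum += p[off];
			exit(sum & 0xff);  /* keep the reads from being elided */
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}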
> >>>
> >>> Does this mean that overall usemem runtime halved?
> >>
> >> Sorry for the confusion, the above line is extracted from perf report.
> >> It shows the percentage of CPU cycles spent in a specific function.
> >>
> >> The two perf lines above are meant to show that get_huge_zero_page
> >> no longer consumes many CPU cycles once the patch is applied.
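
For reference, per-symbol percentages like these come from a cycles
profile; the thread doesn't record the exact invocation, but something
along these lines would produce them (flags are illustrative assumptions):

    # sample CPU cycles system-wide while the workload runs,
    # then print the per-symbol breakdown
    perf record -a -- usemem -n 72 --readonly -j 0x200000 100G
    perf report --stdio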
> >>
> >>>
> >>> Do we have any numbers for something more real-world?
> >>
> >> Unfortunately, no real-world numbers.
> >>
> >> We think the global atomic counter could be a performance issue,
> >> so I'm trying to solve that problem.
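
To make the concern concrete: every read fault on not-yet-written
anonymous THP memory takes a reference on a single global counter, so 72
processes faulting in parallel all hammer one cacheline. A schematic
userspace rendering of before versus after (my simplification, not the
code in mm/huge_memory.c; the kernel uses an atomic test-and-set on
mm->flags where the plain bool below stands in):

#include <stdatomic.h>
#include <stdbool.h>

static atomic_long huge_zero_refcount;  /* one global, shared by all CPUs */

struct mm { bool has_hzp_ref; };        /* stand-in for mm_struct */

/* Before: every huge-zero-page fault bumps the global counter,
 * bouncing its cacheline across CPUs under load. */
static void fault_before(struct mm *mm)
{
	(void)mm;
	atomic_fetch_add(&huge_zero_refcount, 1);
}

/* After: the reference is taken once per mm and dropped when the mm
 * exits, so repeat faults in the same process never touch the shared
 * counter. */
static void fault_after(struct mm *mm)
{
	if (!mm->has_hzp_ref) {
		atomic_fetch_add(&huge_zero_refcount, 1);
		mm->has_hzp_ref = true;
	}
}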
> > 
> > So, umm, we don't actually know if the patch is useful to anyone?
> 
> On a POWER system it reduces the CPU consumption of the above-mentioned
> function a little. I don't think it's going to improve the actual
> throughput of the workload substantially.
> 
> 0.07%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page
> 
> to
> 
> 0.01%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page

I can't say I'm surprised really.  A huge page is, ahem, huge.  The
computational cost of actually writing stuff into that page will swamp
the cost of the locking to acquire it.
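
A rough back-of-envelope supports that (order-of-magnitude figures I'm
assuming, not measurements from this thread):

    populating one 2MB page at ~10 GB/s : ~200 microseconds
    one contended atomic increment      : ~100 nanoseconds

For a workload that actually writes what it maps, the counter is well
under a tenth of a percent of the work. The benchmark above reads one
byte per 2MB and writes nothing, which is exactly why the counter could
dominate its profile.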

Is the patch really worth the additional complexity?
