Message-ID: <20170815123636.3788230c@redhat.com>
Date:   Tue, 15 Aug 2017 12:36:36 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Kemi Wang <kemi.wang@...el.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Johannes Weiner <hannes@...xchg.org>,
        Dave <dave.hansen@...ux.intel.com>,
        Andi Kleen <andi.kleen@...el.com>,
        Ying Huang <ying.huang@...el.com>,
        Aaron Lu <aaron.lu@...el.com>, Tim Chen <tim.c.chen@...el.com>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>, brouer@...hat.com
Subject: Re: [PATCH 0/2] Separate NUMA statistics from zone statistics

On Tue, 15 Aug 2017 16:45:34 +0800
Kemi Wang <kemi.wang@...el.com> wrote:

> Each page allocation updates a set of per-zone statistics with a call to
> zone_statistics(). As discussed in 2017 MM submit, these are a substantial
                                             ^^^^^^ should be "summit"
> source of overhead in the page allocator and are very rarely consumed. This
> causes significant overhead from cache bouncing when the zone counters (NUMA
> associated counters) are updated in parallel during multi-threaded page
> allocation (pointed out by Dave Hansen).

Hi Kemi

Thanks a lot for following up on this work. A link to the MM summit slides:
 http://people.netfilter.org/hawk/presentations/MM-summit2017/MM-summit2017-JesperBrouer.pdf

> To mitigate this overhead, this patchset separates NUMA statistics from the
> zone statistics framework and updates the NUMA counter threshold to a fixed
> size of 32765, since a small threshold greatly increases the update frequency
> of the global counter from the local per-cpu counters (suggested by Ying Huang).
> The rationale is that these statistics counters don't need to be read
> often, unlike other VM counters, so it's not a problem to use a large
> threshold and make readers more expensive.
> 
> With this patchset, we see a 26.6% drop in CPU cycles (537-->394, see below)
> per single-page allocation and reclaim on Jesper's page_bench03
> benchmark. Meanwhile, this patchset keeps the same style of virtual memory
> statistics with little end-user-visible effects (see the first patch for
> details), except that the number of NUMA items in each cpu
> (vm_numa_stat_diff[]) is added to zone->vm_numa_stat[] when a user *reads*
> the value of NUMA counter to eliminate deviation.

I'm very happy to see that you found my kernel module for benchmarking useful :-)

> I ran an experiment with single-page allocation and reclaim running
> concurrently, using Jesper's page_bench03 benchmark on a 2-socket
> Broadwell-based server (88 processors with 126G memory), with different
> threshold sizes for the pcp counter.
> 
> Benchmark provided by Jesper D Broucer(increase loop times to 10000000):
                                 ^^^^^^^
You mis-spelled my last name, it is "Brouer".

> https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/mm/bench
> 
>    Threshold   CPU cycles    Throughput(88 threads)
>       32        799         241760478
>       64        640         301628829
>       125       537         358906028 <==> system by default
>       256       468         412397590
>       512       428         450550704
>       4096      399         482520943
>       20000     394         489009617
>       30000     395         488017817
>       32765     394(-26.6%) 488932078(+36.2%) <==> with this patchset
>       N/A       342(-36.3%) 562900157(+56.8%) <==> disable zone_statistics
> 
> Kemi Wang (2):
>   mm: Change the call sites of numa statistics items
>   mm: Update NUMA counter threshold size
> 
>  drivers/base/node.c    |  22 ++++---
>  include/linux/mmzone.h |  25 +++++---
>  include/linux/vmstat.h |  33 ++++++++++
>  mm/page_alloc.c        |  10 +--
>  mm/vmstat.c            | 162 +++++++++++++++++++++++++++++++++++++++++++++++--
>  5 files changed, 227 insertions(+), 25 deletions(-)
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
