linux-kernel - Re: [PATCH] nohz: nohz idle balancing per node

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1625181736.w1x011tuds.astroid@bobo.none>
Date:   Fri, 02 Jul 2021 09:33:45 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Mel Gorman <mgorman@...e.de>, Peter Zijlstra <peterz@...radead.org>
Cc:     Daniel Bristot de Oliveira <bristot@...hat.com>,
        Ben Segall <bsegall@...gle.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Vaidyanathan Srinivasan <svaidy@...ux.ibm.com>,
        Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [PATCH] nohz: nohz idle balancing per node

Excerpts from Mel Gorman's message of July 1, 2021 11:11 pm:
> On Thu, Jul 01, 2021 at 12:18:18PM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 01, 2021 at 03:53:23PM +1000, Nicholas Piggin wrote:
>> > Currently a single nohz idle CPU is designated to perform balancing on
>> > behalf of all other nohz idle CPUs in the system. Implement a per node
>> > nohz balancer to minimize cross-node memory accesses and runqueue lock
>> > acquisitions.
>> > 
>> > On a 4 node system, this improves performance by 9.3% on a 'pgbench -N'
>> > with 32 clients/jobs (which is about where throughput maxes out due to
>> > IO and contention in postgres).
>> 
>> Hmm, Suresh tried something like this around 2010 and then we ran into
>> trouble that when once node went completely idle and another node was
>> fully busy, the completely idle node would not run ILB and the node
>> would forever stay idle.
>> 
> 
> An effect like that *might* be visible at
> https://beta.suse.com/private/mgorman/melt/v5.13/3-perf-test/sched/sched-nohznuma-v1r1/html/network-tbench/hardy2/
> at the CPU usage heatmaps ordered by topology at the very bottom of
> the page.
> 
> The heatmap covers all client counts so there are "blocks" of activity for
> each client count tested. The third block is for 8 thread counts so a node
> is not fully busy yet.

I'm not sure what I'm looking at. Where are these blocks? Along the x 
axis?

> However, with the vanilla kernel, there is some
> load on each node but with the patch all the load is on one node. This
> did not happen on the two other test machines so the observation is not
> reliable and could be a total coincidence.

tbench is pretty finicky so it could be.

> 
> That said, there were some gains but large losses depending on the client
> count across the 3 machines for tbench which is a concern. Other results,
> like pgbench mentioned in the changelog, will not complete until tomorrow
> to see if it is a general pattern or tbench-specific.
> 
> https://beta.suse.com/private/mgorman/melt/v5.13/3-perf-test/sched/sched-nohznuma-v1r1/html/network-tbench/bing2/
> https://beta.suse.com/private/mgorman/melt/v5.13/3-perf-test/sched/sched-nohznuma-v1r1/html/network-tbench/hardy2/
> https://beta.suse.com/private/mgorman/melt/v5.13/3-perf-test/sched/sched-nohznuma-v1r1/html/network-tbench/marvin2/

All 2-node. How many runs does it do at each clinet count? There's a big 
regression at one clinet with one of them, but the other two have small 
gains.

Thanks,
Nick