linux-kernel - [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

Message-ID: <20180606122731.GB27707@jra-laptop.brq.redhat.com>
Date:   Wed, 6 Jun 2018 14:27:32 +0200
From:   Jakub Racek <jracek@...hat.com>
To:     linux-kernel@...r.kernel.org
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>, linux-acpi@...r.kernel.org,
        jracek@...hat.com
Subject: [4.17 regression] Performance drop on kernel-4.17 visible on Stream,
 Linpack and NAS parallel benchmarks

Hi,

There is a large performance regression on 2- and 4-node NUMA systems with the
4.17 kernel compared to the 4.16 kernel.
The Stream, Linpack and NAS parallel benchmarks show a performance drop of up to 50%.

When running, for example, 20 stream processes in parallel, we see the following behavior:

* all processes are started on NODE #1
* memory is also allocated on NODE #1
* roughly half of the processes are moved to NODE #0 very quickly
* however, memory is not moved to NODE #0 and stays allocated on NODE #1

As a result, half of the processes are running on NODE #0 while their memory is still
allocated on NODE #1. This leads to non-local memory accesses
and a high Remote-To-Local Memory Access Ratio on the numatop charts.

So it seems that 4.17 is not doing a good job of moving the memory to the right NUMA
node after the process has been moved.
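
For anyone who wants to check the memory placement without numatop, here is a
minimal sketch that sums the per-node page counts (the N<node>=<pages> fields)
from /proc/<pid>/numa_maps and reports what fraction of a process's pages sit
on a node other than the one it runs on. This is not how the numbers above were
obtained. The helper names are made up for illustration, and finding the node a
task currently runs on is left to the caller.

```python
import re
from collections import Counter

# Matches per-node page counts such as "N0=1234" in a numa_maps line.
_NODE_PAGES = re.compile(r"\bN(\d+)=(\d+)\b")


def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields over all mappings of one process."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        for node, pages in _NODE_PAGES.findall(line):
            totals[int(node)] += int(pages)
    return dict(totals)


def remote_fraction(totals, cpu_node):
    """Fraction of pages that live on a node other than cpu_node."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - totals.get(cpu_node, 0) / total


def read_numa_maps(pid):
    """Read the raw numa_maps text for a pid (needs a NUMA-enabled kernel)."""
    with open(f"/proc/{pid}/numa_maps") as f:
        return f.read()
```

Usage would be something like remote_fraction(pages_per_node(read_numa_maps(pid)), 0)
for a process currently running on NODE #0. A value close to 1.0 would match the
behavior described above, where the process has moved but its memory has not.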

----8<----

The above is an excerpt from performance testing on 4.16 and 4.17 kernels.

For now I'm merely making sure the problem is reported.

Thank you.

Best regards,
Jakub Racek