Date:   Wed, 6 Jun 2018 14:27:32 +0200
From:   Jakub Racek <jracek@...hat.com>
To:     linux-kernel@...r.kernel.org
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>, linux-acpi@...r.kernel.org,
        jracek@...hat.com
Subject: [4.17 regression] Performance drop on kernel-4.17 visible on Stream,
 Linpack and NAS parallel benchmarks

Hi,

There is a huge performance regression on 2- and 4-node NUMA systems with the
4.17 kernel compared to the 4.16 kernel.
The Stream, Linpack and NAS parallel benchmarks show up to a 50% performance drop.

When running for example 20 stream processes in parallel, we see the following behavior:

* all processes are started at NODE #1
* memory is also allocated on NODE #1
* roughly half of the processes are moved to the NODE #0 very quickly. 
* however, memory is not moved to NODE #0 and stays allocated on NODE #1

As a result, half of the processes are running on NODE #0 while their memory is still
allocated on NODE #1. This leads to non-local memory accesses
and a high Remote-To-Local Memory Access Ratio on the numatop charts.

So it seems that 4.17 is not doing a good job of moving the memory to the right NUMA
node after the process has been moved.
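
One way to check this placement outside of numatop is to sum the per-node
page counts that Linux exposes in /proc/<pid>/numa_maps (the N<node>=<pages>
fields). The following is only a rough sketch and was not part of the original
testing; the function names pages_per_node and remote_fraction are made up for
illustration, and the node a process is running on has to be supplied by the
caller.

```python
import re
from collections import Counter

# Matches the per-node page count fields in numa_maps, e.g. "N0=25".
_NODE_PAGES = re.compile(r"^N(\d+)=(\d+)$")


def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields over all mappings.

    Counts are in units of each mapping's own page size, so the totals
    are only approximate when page sizes are mixed (e.g. with hugepages).
    """
    totals = Counter()
    for line in numa_maps_text.splitlines():
        for token in line.split():
            match = _NODE_PAGES.match(token)
            if match:
                totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)


def remote_fraction(totals, running_node):
    """Fraction of counted pages that are NOT on running_node.

    running_node is an assumption supplied by the caller (the node the
    process is currently scheduled on); it is not derived here.
    """
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - totals.get(running_node, 0) / total
```

To use it, read /proc/<pid>/numa_maps for each stream process and pass the
contents to pages_per_node. For a process running on NODE #0 whose memory
stayed on NODE #1, remote_fraction(totals, 0) would be close to 1.0.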

----8<----

The above is an excerpt from performance testing on 4.16 and 4.17 kernels.

For now I'm merely making sure the problem is reported.

Thank you.

Best regards,
Jakub Racek
