Message-ID: <alpine.DEB.2.22.394.2010151315190.2869@hadrien>
Date:   Thu, 15 Oct 2020 13:38:31 +0200 (CEST)
From:   Julia Lawall <julia.lawall@...ia.fr>
To:     Waiman Long <longman@...hat.com>
cc:     Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Gilles Muller <Gilles.Muller@...ia.fr>
Subject: slowdown due to reader-owned rwsem time-based spinning

Hello,

Phoenix is an implementation of map reduce:

https://github.com/kozyraki/phoenix

The phoenix-2.0/tests subdirectory contains some benchmarks, including
word_count.

At the same time, on my server, since v5.8, the kernel has changed from
using the cpufreq driver intel_pstate by default to using intel_cpufreq.
Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
while intel_pstate involves very few such stray processes.

Surprisingly, all those kworkers cause the word_count benchmark to run 2-3
times faster.  I bisected the problem back to the following commit, which
was introduced in v5.3:

commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
Author: Waiman Long <longman@...hat.com>
Date:   Mon May 20 16:59:13 2019 -0400

    locking/rwsem: Enable time-based spinning on reader-owned rwsem

Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
the one with the kworkers.

I don't know the Phoenix code in detail, but the problem seems to be in
the infrastructure, not the specific word count application, because most of
the benchmarks seem to suffer similarly.  Some of the other benchmarks
seem to take a variable and long amount of time to get started in the
active mode, so perhaps the problem could be in reading the initial
dataset.

Before I plunge into it, do you have any suggestions as to what could be
the problem?

thanks,
julia
Download attachment "word_count_5.9pwrsvactive_1.pdf" of type "application/pdf" (1511252 bytes)

Download attachment "word_count_5.9pwrsvpassive_1.pdf" of type "application/pdf" (1797989 bytes)
