Message-ID: <87sfdgr55p.ffs@tglx>
Date:   Tue, 04 Apr 2023 00:07:30 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ravi Bangoria <ravi.bangoria@....com>,
        Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Jiri Olsa <jolsa@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH] perf: Optimize perf_pmu_migrate_context()

On Mon, Apr 03 2023 at 11:08, Peter Zijlstra wrote:
> Thomas reported that offlining CPUs spends a lot of time in
> synchronize_rcu() as called from perf_pmu_migrate_context() even though
> he's not actually using uncore events.

That happens when offlining CPUs from a socket > 0 in the same order in
which those CPUs were brought up. On socket 0 this is not observable
unless the bogus CPU0 offlining hack is enabled.

If the offlining happens in the reverse order then all is shiny.

The reason is that the first online CPU on a socket gets the uncore
events assigned and when it is offlined then those are moved to the next
online CPU in the same socket.
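
A minimal userspace sketch of why the offline order matters. This is a
toy model, not kernel code: NCPUS, count_migrations() and the
"lowest-numbered online CPU becomes the new owner" rule are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 8 /* hypothetical number of threads in one socket */

/*
 * Count how often the uncore events have to be handed to another CPU
 * when the CPUs of one socket are offlined in the given order.
 * Assumption: CPU 0 was brought up first and therefore owns the events.
 */
static int count_migrations(const int *order, size_t n)
{
    bool online[NCPUS];
    int owner = 0;
    int migrations = 0;

    for (int c = 0; c < NCPUS; c++)
        online[c] = true;

    for (size_t i = 0; i < n; i++) {
        int cpu = order[i];

        assert(cpu >= 0 && cpu < NCPUS);
        online[cpu] = false;
        if (cpu != owner)
            continue; /* not the owner: nothing to move */

        /* The owner went offline: look for the next online CPU. */
        owner = -1;
        for (int c = 0; c < NCPUS; c++) {
            if (online[c]) {
                owner = c;
                break;
            }
        }
        if (owner >= 0)
            migrations++; /* last CPU gone means shutdown, no migration */
    }
    return migrations;
}
```

In this model, offlining in bring-up order hits the current owner every
time, so all CPUs but the last cause a handoff. Offlining in reverse
order only reaches the owner at the very end, when nothing is left to
hand over.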

On a SKL-X with 56 threads per socket this results in a whopping _1_
second delay per thread (except for the last one which shuts down the
per socket uncore events with no delay because there are no users) due
to 62 pointless synchronize_rcu() invocations where each takes
~16ms on a HZ=250 kernel.
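
The arithmetic behind those figures, as a small sketch. The constants
are the numbers quoted above; the function names are made up for
illustration, and the per-socket total is simply derived from "all
threads but the last one".

```c
#include <assert.h>

/* Figures quoted in this message. */
enum {
    SYNC_RCU_CALLS     = 62,  /* synchronize_rcu() calls per offlined thread */
    SYNC_RCU_MS        = 16,  /* approximate cost of each call in ms */
    THREADS_PER_SOCKET = 56,
    TICK_HZ            = 250
};

/* Delay caused by the RCU waits for one offlined thread, in ms. */
static long per_thread_delay_ms(long calls, long ms_per_call)
{
    assert(calls >= 0 && ms_per_call >= 0);
    return calls * ms_per_call;
}

/* Delay for a whole socket: every thread except the last one pays. */
static long per_socket_delay_ms(long threads, long thread_delay_ms)
{
    assert(threads >= 1 && thread_delay_ms >= 0);
    return (threads - 1) * thread_delay_ms;
}

/* Length of one scheduler tick in ms for a given HZ value. */
static long tick_ms(long hz)
{
    assert(hz > 0);
    return 1000 / hz;
}
```

That gives 62 * 16 = 992ms per thread, i.e. the ~1 second mentioned
above, and roughly 55 seconds for the 55 threads of a socket which pay
the delay. With a 4ms tick at HZ=250, ~16ms is about four ticks per
synchronize_rcu() call.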

Which in turn is interesting because that machine is completely idle
other than running the offline muck...

> Turns out, the thing is unconditionally waiting for RCU, even if there's
> no actual events to migrate.
>
> Fixes: 0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()")
> Reported-by: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Tested-by: Thomas Gleixner <tglx@...utronix.de>
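
The idea described in the changelog, wait for RCU only when something
was actually removed, can be modelled in userspace roughly as follows.
This is a sketch of the idea and not the patch itself:
migrate_events() and the wait counter are hypothetical stand-ins, and
the counter increment replaces the real synchronize_rcu() call.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of migrating nr_events events to another CPU.
 * *rcu_waits counts how often a grace period would have been waited for.
 * Returns the number of events re-installed on the target.
 */
static size_t migrate_events(size_t nr_events, int *rcu_waits)
{
    assert(rcu_waits != NULL);

    if (nr_events == 0)
        return 0; /* nothing was removed, so there is nothing to wait for */

    /* Stand-in for synchronize_rcu(): let the removed events quiesce. */
    (*rcu_waits)++;

    return nr_events;
}
```

With no events on the source CPU the model returns without touching the
wait counter, which corresponds to the case where no uncore events are
in use.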

Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
