Message-ID: <20241120090354.GE19989@noisy.programming.kicks-ass.net>
Date: Wed, 20 Nov 2024 10:03:54 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Chenbo Lu <chenbo.lu@...yaviation.com>
Cc: stable@...r.kernel.org, regressions@...ts.linux.dev, mingo@...hat.com,
	juri.lelli@...hat.com, linux-kernel@...r.kernel.org,
	vschneid@...hat.com
Subject: Re: Performance Degradation After Upgrading to Kernel 6.8

On Tue, Nov 19, 2024 at 04:30:02PM -0800, Chenbo Lu wrote:
> Hello,
> 
> I am experiencing a significant performance degradation after
> upgrading my kernel from version 6.6 to 6.8 and would appreciate any
> insights or suggestions.
> 
> I am running a high-load simulation system that spawns more than 1000
> threads, and the overall CPU usage is 30%+. Most of the threads use
> real-time scheduling (SCHED_RR), and the threads of a model use
> SCHED_DEADLINE. After upgrading the kernel, I noticed that the
> execution time of my model increased from 4.5ms to 6ms.
> 
> What I Have Done So Far:
> 1. I found this [bug
> report](https://bugzilla.kernel.org/show_bug.cgi?id=219366#c7) and
> reverted the commit efa7df3e3bb5da8e6abbe37727417f32a37fba47 mentioned
> in the post. Unfortunately, this did not resolve the issue.
> 2. I performed a git bisect and found that after these two commits
> related to scheduling (RT and deadline) were merged, the problem
> happened. They are 612f769edd06a6e42f7cd72425488e68ddaeef0a,
> 5fe7765997b139e2d922b58359dea181efe618f9

And yet you failed to Cc Valentin, the author of said commits :/

> After reverting these two commits, the model execution time improved
> to around 5 ms.
> 3. I reverted two more commits, and the execution time is back to 4.7ms:
> 63ba8422f876e32ee564ea95da9a7313b13ff0a1,
> efa7df3e3bb5da8e6abbe37727417f32a37fba47
> 
> My questions are:
> 1. Has anyone else experienced similar performance degradation after
> upgrading to kernel 6.8?

This is 4 kernel releases back; my memory isn't that long.

> 2. Can anyone explain why these two commits are causing the problem? I
> am not very familiar with the kernel code and would appreciate any
> insights.

There might be a race window between setting the rto and sending the
IPI, such that previously the extra IPIs would sooner find the newly
pushable task.

Valentin, would it make sense to set rto before enqueueing the pushable,
instead of after it?
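
One possible reading of the ordering question, sketched as a toy
user-space model and not as kernel code: a remote CPU decides whether
to send an IPI by looking only at the overload flag discussed above,
and the IPI handler later looks for a pushable task. The struct, its
fields, and all function names below are hypothetical and exist only
for illustration; the single-threaded "scan in between" step stands in
for the suspected race window.

```c
#include <stdbool.h>

/* Toy stand-in for a runqueue; NOT the kernel's struct rq. */
struct toy_rq {
    bool overloaded;   /* models the per-CPU RT-overload flag */
    int  nr_pushable;  /* models the length of the pushable-task list */
};

/* Assumption: a remote CPU sends an IPI only if the flag is set. */
static bool ipi_would_be_sent(const struct toy_rq *rq)
{
    return rq->overloaded;
}

/* Assumption: the IPI handler runs later and looks for a pushable task. */
static bool ipi_handler_finds_task(const struct toy_rq *rq)
{
    return rq->nr_pushable > 0;
}

/* Ordering A: enqueue first, set the flag afterwards. */
static bool enqueue_then_flag(void)
{
    struct toy_rq rq = { .overloaded = false, .nr_pushable = 0 };

    rq.nr_pushable++;                      /* task becomes pushable */
    bool sent = ipi_would_be_sent(&rq);    /* remote scan lands in the window */
    rq.overloaded = true;                  /* flag is set too late for that scan */

    return sent && ipi_handler_finds_task(&rq);
}

/* Ordering B: set the flag first, enqueue afterwards. */
static bool flag_then_enqueue(void)
{
    struct toy_rq rq = { .overloaded = false, .nr_pushable = 0 };

    rq.overloaded = true;                  /* flag is visible early */
    bool sent = ipi_would_be_sent(&rq);    /* remote scan lands in the window */
    rq.nr_pushable++;                      /* task is enqueued before the handler runs */

    return sent && ipi_handler_finds_task(&rq);
}
```

In this model, a scan that lands in the window misses the task under
ordering A (no IPI is sent) and finds it under ordering B. Whether the
real scheduler paths interleave this way is exactly the open question.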
