Message-ID: <20241120090354.GE19989@noisy.programming.kicks-ass.net>
Date: Wed, 20 Nov 2024 10:03:54 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Chenbo Lu <chenbo.lu@...yaviation.com>
Cc: stable@...r.kernel.org, regressions@...ts.linux.dev, mingo@...hat.com,
juri.lelli@...hat.com, linux-kernel@...r.kernel.org,
vschneid@...hat.com
Subject: Re: Performance Degradation After Upgrading to Kernel 6.8
On Tue, Nov 19, 2024 at 04:30:02PM -0800, Chenbo Lu wrote:
> Hello,
>
> I am experiencing a significant performance degradation after
> upgrading my kernel from version 6.6 to 6.8 and would appreciate any
> insights or suggestions.
>
> I am running a high-load simulation system that spawns more than 1000
> threads and the overall CPU usage is 30%+. Most of the threads are
> using real-time scheduling (SCHED_RR), and the threads of a model are
> using SCHED_DEADLINE. After upgrading the kernel, I noticed that the
> execution time of my model has increased from 4.5ms to 6ms.
>
> What I Have Done So Far:
> 1. I found this [bug
> report](https://bugzilla.kernel.org/show_bug.cgi?id=219366#c7) and
> reverted the commit efa7df3e3bb5da8e6abbe37727417f32a37fba47 mentioned
> in the post. Unfortunately, this did not resolve the issue.
> 2. I performed a git bisect and found that after these two commits
> related to scheduling (RT and deadline) were merged, the problem
> happened. They are 612f769edd06a6e42f7cd72425488e68ddaeef0a,
> 5fe7765997b139e2d922b58359dea181efe618f9
And yet you failed to Cc Valentin, the author of said commits :/
> After reverting these two commits, the model execution time improved
> to around 5 ms.
> 3. I reverted two more commits, and the execution time is back to 4.7ms:
> 63ba8422f876e32ee564ea95da9a7313b13ff0a1,
> efa7df3e3bb5da8e6abbe37727417f32a37fba47
>
> My questions are:
> 1.Has anyone else experienced similar performance degradation after
> upgrading to kernel 6.8?
This is 4 kernel releases back; my memory isn't that long.
> 2.Can anyone explain why these two commits are causing the problem? I
> am not very familiar with the kernel code and would appreciate any
> insights.
There might be a race window between setting rto (RT overload) and
sending the IPI, such that previously the extra IPIs would sooner find
the newly pushable task.
Valentin, would it make sense to set rto before enqueueing the pushable
task, instead of after it?
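
To make the ordering question concrete, here is a toy, single-threaded model. It is not the kernel code: `toy_rq`, `snapshot`, `peek()` and both helpers are made-up names for illustration, and the real paths involve locking, memory barriers and IPIs that this sketch ignores. It only records what a remote CPU could observe if it looked between the two updates, for each ordering of "set the overload flag" and "enqueue the pushable task".

```c
#include <stdbool.h>

/* Hypothetical toy runqueue; NOT the kernel's struct rq. */
struct toy_rq {
    bool overloaded;  /* stands in for the overload flag */
    int nr_pushable;  /* stands in for the pushable-task list */
};

/* What a remote CPU would see at one instant. */
struct snapshot {
    bool overloaded;
    int nr_pushable;
};

static struct snapshot peek(const struct toy_rq *rq)
{
    struct snapshot s = { rq->overloaded, rq->nr_pushable };
    return s;
}

/* Ordering A: enqueue the pushable task first, then set the flag.
 * Returns the state visible between the two updates. */
static struct snapshot enqueue_then_flag(struct toy_rq *rq)
{
    rq->nr_pushable++;
    struct snapshot mid = peek(rq);  /* task queued, flag still clear */
    rq->overloaded = true;
    return mid;
}

/* Ordering B: set the flag first, then enqueue the pushable task.
 * Returns the state visible between the two updates. */
static struct snapshot flag_then_enqueue(struct toy_rq *rq)
{
    rq->overloaded = true;
    struct snapshot mid = peek(rq);  /* flag set, task not yet queued */
    rq->nr_pushable++;
    return mid;
}
```

Starting from an empty `toy_rq`, `enqueue_then_flag()` returns a snapshot with a pushable task but a clear flag, so an observer keyed on the flag would skip this runqueue during that window. `flag_then_enqueue()` returns the opposite: the flag is set but the task is not there yet. Which window matters in practice depends on the real locking and IPI paths, which this model does not capture.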