Message-ID: <aSW2ACoDta8j3S7E@jlelli-thinkpadt14gen4.remote.csb>
Date: Tue, 25 Nov 2025 13:58:24 +0000
From: Juri Lelli <juri.lelli@...hat.com>
To: David Haufe <dhaufe@...plextrading.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Kernel 6.14.11 dl_server_timer(...) causing IPI/Function Call
Interrupts on isolcpu/nohz_full cores, performance regression
Hi David,
On 24/11/25 14:30, David Haufe wrote:
> Hi Juri,
> Working with 6.17.7 (code appears the same in .8 and 6.18rc), we are
> still having isolated/nohz cores interrupted by deadline
> functionality. The dl_task_timer is firing once a second with a single
> SCHED_OTHER process spinning on the core. I am no longer seeing the
> IPI interrupts, but the hrtimer activity is still causing a performance
> regression compared to kernels prior to the deadline merge.
Were you able to check if the changes on the branch below made things
any better? It is hopefully not that hard to apply to newer/different
baselines.
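For comparing baselines, one rough option is to diff the per-CPU counters in /proc/interrupts across a fixed interval while the SCHED_OTHER task spins on the isolated core. The sketch below is only an illustration and not part of the branch: the helper names (parse_interrupts, sample) are hypothetical, and the "LOC" (local timer) and "CAL" (function call) row labels assume x86.

```python
import time


def parse_interrupts(text, label):
    """Return {cpu_number: count} for the /proc/interrupts row named `label`."""
    lines = text.splitlines()
    # The header row names the online CPUs, which need not be contiguous.
    cpus = [int(name[3:]) for name in lines[0].split()]
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == label:
            # Keep only the per-CPU columns; the trailing text is a description.
            fields = rest.split()[:len(cpus)]
            return {cpu: int(val) for cpu, val in zip(cpus, fields)}
    raise KeyError(label)


def sample(cpu, label="LOC", interval=10.0, path="/proc/interrupts"):
    """Return how many `label` interrupts hit `cpu` during `interval` seconds."""
    with open(path) as f:
        before = parse_interrupts(f.read(), label)
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read(), label)
    return after[cpu] - before[cpu]
```

Calling sample(cpu) with the isolated CPU number on each kernel, under the same load, gives a number to compare. It counts every local timer interrupt on that CPU, so it does not attribute them to dl_server_timer specifically.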
I have to apologize in advance, but I'm going to be on PTO for a few
days and then I have to travel for LPC26. So, unfortunately, I am going
to have very limited availability until the end of the year.
Best,
Juri
> On Mon, Aug 4, 2025 at 11:59 AM Juri Lelli <juri.lelli@...hat.com> wrote:
> >
> > On 04/08/25 10:44, David Haufe wrote:
> > > My apologies, I see what you mean now. add_nr_running() is not being
> > > invoked if it is the dl_server. We are still trying to get this branch
> > > to boot to verify ourselves. We will be on the lookout for this to be
> > > merged for release.
> >
> > No worries, guess I wasn't clear the first time. :)
> >
> > I added a very much experimental commit on
> >
> > https://github.com/jlelli/linux/tree/upstream/fix-dlserver-1
> >
> > that seems to be able to remove the one per second dl_server_timer and
> > start it back as needed. But, I just played briefly with it, so I am not
> > fully convinced it is what we want. Anyway, if you could test with it as
> > well it would be a useful data point. In principle you could try porting
> > the following commits to your current tree and check if they do improve
> > things (in reverse order starting from the bottom from the branch above):
> >
> > f237e524f3c7 ("sched/deadline: Make dl-server nohz full aware")
> > 219a63335b67 ("sched/deadline: Don't count nr_running twice for dl_server proxy tasks")
> > 7620177e8108 ("sched/deadline: Fix RT task potential starvation when expiry time passed")
> > cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")