Message-ID: <aH5CJgkXuOxYSSmF@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 21 Jul 2025 15:35:34 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Yuri Andriaccio <yurand2000@...il.com>
Cc: bsegall@...gle.com, dietmar.eggemann@....com,
linux-kernel@...r.kernel.org, luca.abeni@...tannapisa.it,
mgorman@...e.de, mingo@...hat.com, peterz@...radead.org,
rostedt@...dmis.org, vincent.guittot@...aro.org,
vschneid@...hat.com, yuri.andriaccio@...tannapisa.it
Subject: Re: [BUG] Bw accounting warning on fair-servers' parameters change

On 21/07/25 11:34, Yuri Andriaccio wrote:
> Hi,
>
> On 18/07/25 16:53, Juri Lelli wrote:
> > On 18/07/25 16:22, Juri Lelli wrote:
> > > Hi,
> > >
> > > Thanks for reporting.
> > >
> > > On 18/07/25 13:38, Yuri Andriaccio wrote:
> > > > Hi,
> > > >
> > > > I've been lately working on fair-servers and dl_servers for some patches and
> > > > I've come across a bandwidth accounting warning on the latest tip/master (as of
> > > > 2025-07-18, git sha ed0272f0675f). The warning is triggered by simply starting
> > > > the machine, mounting debugfs and then just zeroing any fair-server's runtime.
> > > >
> > > >
> > > > The warning:
> > > >
> > > > WARNING: kernel/sched/deadline.c:266 at dl_rq_change_utilization+0x208/0x230
> > > > static inline void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq) {
> > > > ...
> > > > WARN_ON_ONCE(dl_rq->running_bw > dl_rq->this_bw);
> > > > }
> > > >
> > > > Steps to reproduce:
> > > >
> > > > mount -t debugfs none /sys/kernel/debug
> > > > echo 0 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
> > > >
> > > >
> > > > It does not happen at every machine boot, but happens on most. Could it possibly
> > > > be related to some of the deadline timers?
> > >
> > > I took a quick first look and currently suspect cccb45d7c4295
> > > ("sched/deadline: Less agressive dl_server handling") could be playing a
> > > role in this as it delays actual server stop.
> > >
> > > Could you please try to repro after having reverted such commit?
> >
> > After that (w/o the revert), could you please try to see if the
> > following helps?
>
> I've been performing some tests as you asked and indeed the culprit seems to be
> cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"), as
> reverting it on the current tip removes the issue.
>
> I've also tested the fix you posted (w/o the reverted commit), and I can confirm
> that the warning does not seem to be triggered anymore.

Thanks!

Sent out a cleaned-up version:
https://lore.kernel.org/lkml/20250721-upstream-fix-dlserver-lessaggressive-b4-v1-1-4ebc10c87e40@redhat.com/