Message-ID: <aH5CJgkXuOxYSSmF@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 21 Jul 2025 15:35:34 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Yuri Andriaccio <yurand2000@...il.com>
Cc: bsegall@...gle.com, dietmar.eggemann@....com,
	linux-kernel@...r.kernel.org, luca.abeni@...tannapisa.it,
	mgorman@...e.de, mingo@...hat.com, peterz@...radead.org,
	rostedt@...dmis.org, vincent.guittot@...aro.org,
	vschneid@...hat.com, yuri.andriaccio@...tannapisa.it
Subject: Re: [BUG] Bw accounting warning on fair-servers' parameters change

On 21/07/25 11:34, Yuri Andriaccio wrote:
> Hi,
> 
> On 18/07/25 16:53, Juri Lelli wrote:
> > On 18/07/25 16:22, Juri Lelli wrote:
> > > Hi,
> > >
> > > Thanks for reporting.
> > >
> > > On 18/07/25 13:38, Yuri Andriaccio wrote:
> > > > Hi,
> > > >
> > > > I've been lately working on fair-servers and dl_servers for some patches and
> > > > I've come across a bandwidth accounting warning on the latest tip/master (as of
> > > > 2025-07-18, git sha ed0272f0675f). The warning is triggered by simply starting
> > > > the machine, mounting debugfs and then just zeroing any fair-server's runtime.
> > > >
> > > >
> > > > The warning:
> > > >
> > > > WARNING: kernel/sched/deadline.c:266 at dl_rq_change_utilization+0x208/0x230
> > > > static inline void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq) {
> > > > 	...
> > > > 	WARN_ON_ONCE(dl_rq->running_bw > dl_rq->this_bw);
> > > > }
> > > >
> > > > Steps to reproduce:
> > > >
> > > > mount -t debugfs none /sys/kernel/debug
> > > > echo 0 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
> > > >
> > > >
> > > > It does not happen at every machine boot, but happens on most. Could it possibly
> > > > be related to some of the deadline timers?
> > >
> > > I took a quick first look and currently suspect cccb45d7c4295
> > > ("sched/deadline: Less agressive dl_server handling") could be playing a
> > > role in this as it delays actual server stop.
> > >
> > > Could you please try to repro after having reverted such commit?
> >
> > After that (w/o the revert), could you please try to see if the
> > following helps?
> 
> I've been performing some tests as you asked and indeed the culprit seems to be
> cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"), as
> reverting it on the current tip removes the issue.
> 
> I've also tested the fix you posted (w/o the reverted commit), and I can confirm
> that the warning does not seem to be triggered anymore.

Thanks!

Sent out a cleaned-up version:

https://lore.kernel.org/lkml/20250721-upstream-fix-dlserver-lessaggressive-b4-v1-1-4ebc10c87e40@redhat.com/

