lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c057bbc383ab5525172b22a4f3480acbcac12045.camel@mediatek.com>
Date: Fri, 27 Jun 2025 11:02:37 +0800
From: Kuyo Chang <kuyo.chang@...iatek.com>
To: John Stultz <jstultz@...gle.com>
CC: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
	<vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, "Mel
 Gorman" <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Matthias
 Brugger <matthias.bgg@...il.com>, AngeloGioacchino Del Regno
	<angelogioacchino.delregno@...labora.com>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: Re: [PATCH v3 1/1] sched/deadline: Fix dl_server runtime
 calculation formula

On Wed, 2025-06-25 at 22:25 -0700, John Stultz wrote:
> 
> 
> 
> On Wed, Jun 25, 2025 at 8:08 PM Kuyo Chang <kuyo.chang@...iatek.com>
> wrote:
> > [Symptom]
> > The calculation formula for dl_server runtime is based on
> > Frequency/capacity scale-invariance.
> > This will cause excessive RT latency (expect absolute time).
> 
> I appreciate your symptom/analysis/solution markers, but they are a
> little unconventional for commit messages, so you might consider
> dropping them.
> 
> Further, you're explaining the cause of the problem before you
> describe what you were seeing.
> 
> I might suggest rewording as:
> "In our testing with 6.12 based kernel on a big.LITTLE system, we
> were
> seeing instances of RT tasks being blocked from running on the LITTLE
> cpus for multiple seconds of time, apparently by the dl_server. This
> far exceeds the default configured 50ms per second runtime."
> 
> Then you can get into the analysis where you can state as you had
> above:
> "This is due to the fair dl_server runtime calculation being scaled
> for frequency & capacity of the cpu"
> 
> > [Analysis]
> > Consider the following case under a Big.LITTLE architecture:
> > 
> > Assume the runtime is: 50,000,000 ns, and Frequency/capacity
> > scale-invariance defined as below:
> > 
> > Frequency scale-invariance: 100
> > Capacity scale-invariance: 50
> > First by Frequency scale-invariance,
> > the runtime is scaled to 50,000,000 * 100 >> 10 = 4,882,812
> > Then by capacity scale-invariance,
> > it is further scaled to 4,882,812 * 50 >> 10 = 238,418.
> > 
> > So it will scaled to 238,418 ns.
> 
> This part is all good, but you might want to make it more clear why
> the smaller scaled "runtime" value here results in longer delays for
> RT tasks.
> ie:
> 
> This smaller "accounted runtime" value is what ends up being
> subtracted against the fair-server's runtime for the current period.
> Thus after 50ms of real time, we've only accounted ~238us against the
> fair servers runtime. This 209:1 ratio in this example means that on
> the smaller cpu the fair server is allowed to continue running,
> blocking RT tasks, for over 10 seconds before it exhausts its
> supposed
> 50ms of runtime.  And on other hardware configurations it can be even
> worse.
> 
> > [Solution]
> > The runtime for dl_server should be fixed time
> > asis RT bandwidth control.
> > Fix the runtime calculation formula for the dl_server.
> 
> I think this can be stated more simply as:
> "For the fair deadline_server, to prevent realtime tasks from being
> unexpectedly delayed, we really do want to use fixed time, and not
> scaled time for smaller capacity/frequency cpus. So remove the
> scaling
> from the fair server's accounting to fix this."
> 
> > Signed-off-by: kuyo chang <kuyo.chang@...iatek.com>
> > Acked-by: Juri Lelli <juri.lelli@...hat.com>
> > Suggested-by: Peter Zijlstra <peterz@...radead.org>
> > Suggested-by: John Stultz <jstultz@...gle.com>
> > Tested-by: John Stultz <jstultz@...gle.com>
> > 
> > v1:
> > https://lore.kernel.org/all/20250614020524.631521-1-kuyo.chang@mediatek.com/
> > v2:
> > https://lore.kernel.org/all/20250617155355.1479777-1-kuyo.chang@mediatek.com/
> 
> Please add a "---" line above the version data so it doesn't get
> added
> to the commit history.
> 
> > Change-Id: Iaaa1ec78f405586c22ba56230ef1143206daa2d0
> 
> Also no Change-Ids in upstream targeted commits.
> ./scripts/checkpatch.pl should warn you about this, and its a good
> idea in general to run that before submitting.
> 
> > ---
> >  kernel/sched/deadline.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index ad45a8fea245..96a21f38fcc3 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1504,7 +1504,9 @@ static void update_curr_dl_se(struct rq *rq,
> > struct sched_dl_entity *dl_se, s64
> >         if (dl_entity_is_special(dl_se))
> >                 return;
> > 
> > -       scaled_delta_exec = dl_scaled_delta_exec(rq, dl_se,
> > delta_exec);
> > +       scaled_delta_exec = delta_exec;
> > +       if (!dl_server(dl_se))
> > +               scaled_delta_exec = dl_scaled_delta_exec(rq, dl_se,
> > delta_exec);
> > 
> >         dl_se->runtime -= scaled_delta_exec;
> > 
> > @@ -1624,7 +1626,9 @@ void dl_server_update_idle_time(struct rq
> > *rq, struct task_struct *p)
> >         if (delta_exec < 0)
> >                 return;
> > 
> > -       scaled_delta_exec = dl_scaled_delta_exec(rq, &rq-
> > >fair_server, delta_exec);
> > +       scaled_delta_exec = delta_exec;
> > +       if (!rq->fair_server.dl_server)
> > +               scaled_delta_exec = dl_scaled_delta_exec(rq, &rq-
> > >fair_server, delta_exec);
> 
> I still feel like these lines don't make much sense, and it seems
> like
> they could be dropped.  When is (!rq->fair_server.dl_server) true?
> 
> >         rq->fair_server.runtime -= scaled_delta_exec;
> > 
> > --
> 
> Thanks so much again for finding this issue and resubmitting this
> patch! I really do want to see this land quickly as it addresses a
> pretty big issue.

Thanks for your reminder and suggestion.
Update to v4 as below link :)

https://lore.kernel.org/all/20250627022837.3331827-1-kuyo.chang@mediatek.com/

> -john


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ