Message-ID: <089882f95b1b910f7feecddd0ad9b17f38394c64.camel@mediatek.com>
Date: Wed, 18 Jun 2025 22:20:23 +0800
From: Kuyo Chang <kuyo.chang@...iatek.com>
To: Juri Lelli <juri.lelli@...hat.com>
CC: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, "Ben
Segall" <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, "Valentin
Schneider" <vschneid@...hat.com>, Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>, jstultz
<jstultz@...gle.com>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential
starvation when expiry time passed
On Mon, 2025-06-16 at 17:03 +0200, Juri Lelli wrote:
> Hello,
>
> >
> > [Proposed Solution]:
> > ------------------
> > Instead of immediately re-enqueuing the DL entity on timer registration
> > failure, this change ensures the DL entity is properly replenished and
> > the timer is restarted, preventing RT potential starvation.
> >
> > Signed-off-by: kuyo chang <kuyo.chang@...iatek.com>
> > ---
> > kernel/sched/deadline.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index ad45a8fea245..e50cb76c961b 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1556,10 +1556,12 @@ static void update_curr_dl_se(struct rq
> > *rq, struct sched_dl_entity *dl_se, s64
> > }
> >
> >  	if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
> > -		if (dl_server(dl_se))
> > -			enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
> > -		else
> > +		if (dl_server(dl_se)) {
> > +			replenish_dl_new_period(dl_se, rq);
> > +			start_dl_timer(dl_se);
>
> But, even today, enqueue_dl_entity() is called with ENQUEUE_REPLENISH
> flag, so I don't get why you say 're-enqueues the DL entity without
> properly replenishing'.
>
> Also, why restarting the replenishing timer right after having
> replenished the entity?
>
When dl_defer_running = 1 and the runtime has been exhausted, the
dl_server should stop running at this point.
However, if start_dl_timer() fails, it means that consuming the runtime
took unexpectedly long and the timer expiry time has already passed.
At this point, there are two options:
[as-is] 1. Re-enqueue the dl entity with ENQUEUE_REPLENISH. This clears
the throttled flag and re-enqueues the dl entity, which keeps the
fair_server running:

    enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
      => replenish_dl_entity
      => replenish_dl_new_period(dl_se, rq);
      => dl_se->dl_yielded = 0;
      => dl_se->dl_throttled = 0;
      => __enqueue_dl_entity(dl_se);

[to-be] 2. To avoid RT latency, keep the fair_server throttled while
replenishing the dl_se. Because the replenishment moves the deadline
forward, the timer can then be started successfully. The throttled
state is cleared only when that timer fires, so RT tasks can execute
during the interval in between.

How to handle a start_dl_timer() failure is a policy decision.
In my opinion the second approach is better for RT latency, because RT
tasks must be prioritized.
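
To make the difference between the two options concrete, here is a
minimal user-space sketch. It is not kernel code: struct toy_dl_se and
all toy_* helpers are hypothetical stand-ins, and the model assumes an
implicit deadline (deadline equals period) and a timer that simply
fires at the entity's deadline. It only shows which flags each policy
leaves set after a failed timer start.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sched_dl_entity (hypothetical, not kernel code). */
struct toy_dl_se {
	long long runtime;     /* remaining budget */
	long long deadline;    /* absolute deadline */
	long long dl_runtime;  /* budget granted per period */
	long long dl_period;   /* period length, assumed > 0 */
	bool throttled;        /* entity must not run while set */
	bool queued;           /* entity is on the toy runqueue */
	bool timer_armed;      /* replenishment timer is pending */
};

/* Give the entity a fresh budget and a deadline one period from now. */
static void toy_replenish_new_period(struct toy_dl_se *se, long long now)
{
	se->deadline = now + se->dl_period;
	se->runtime = se->dl_runtime;
}

/*
 * Assumption: the timer fires at the current deadline. Arming fails
 * (returns false) when that point is already in the past.
 */
static bool toy_start_timer(struct toy_dl_se *se, long long now)
{
	if (se->deadline <= now)
		return false;
	se->timer_armed = true;
	return true;
}

/* Option 1 [as-is]: replenish, clear throttled, re-enqueue right away. */
static void toy_policy_as_is(struct toy_dl_se *se, long long now)
{
	toy_replenish_new_period(se, now);
	se->throttled = false;
	se->queued = true;
}

/* Option 2 [to-be]: replenish, stay throttled, and arm the timer. */
static void toy_policy_to_be(struct toy_dl_se *se, long long now)
{
	bool armed;

	toy_replenish_new_period(se, now);
	/* The deadline now lies in the future, so arming should succeed. */
	armed = toy_start_timer(se, now);
	assert(armed);
	(void)armed;
}
```

Starting from a throttled, dequeued entity whose deadline has already
passed, toy_policy_as_is() leaves it runnable immediately, while
toy_policy_to_be() leaves it throttled with a pending timer. That
throttled window is where RT tasks get to run in the second approach.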
> Thanks,
> Juri
>