Message-ID: <a1103727ffaaf5f4d1b077bc09a3cc5168c5708d.camel@mediatek.com>
Date: Sat, 21 Jun 2025 10:55:16 +0800
From: Kuyo Chang <kuyo.chang@...iatek.com>
To: Juri Lelli <juri.lelli@...hat.com>
CC: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, "Ben
Segall" <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, "Valentin
Schneider" <vschneid@...hat.com>, Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>, jstultz
<jstultz@...gle.com>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential
starvation when expiry time passed
On Fri, 2025-06-20 at 17:22 +0200, Juri Lelli wrote:
>
> On 20/06/25 11:00, Kuyo Chang wrote:
>
> ...
>
> >
>
> Thanks for the additional explanation.
>
> The way I understand it now is the following (of course please correct
> me if I am still not getting it :)
>
> - a dl_server is actively servicing NORMAL tasks, but suffers a lot of
>   IRQ load and cannot make much progress
> - it makes some progress anyway, but it reaches
>   update_curr_dl_se@throttle only when its current deadline is already
>   behind rq_clock()
> - the dl_runtime_exceeded() branch is entered, but start_dl_timer() fails
>   because the computed act is still in the past
> - enqueue_dl_entity(REPLENISH) calls replenish_dl_entity(), which tries to
>   add runtime and advance the deadline, but time has moved on so far that
>   the deadline is still behind rq_clock(), and so "DL replenish ..." is
>   printed
> - replenish_dl_new_period() updates runtime and deadline from the current
>   clock and the dl-server is put back to run (so it continues to run
>   over/starve FIFO tasks)
>
Yes, "DL replenish ..." is the critical clue for identifying the root
cause of this issue.
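
To make the sequence above concrete, here is a minimal user-space sketch
of it. This is not the kernel code: the toy_* names and the struct are
hypothetical stand-ins, the timer check ignores the clock-base conversion
that start_dl_timer() performs, and the logic only approximates what
replenish_dl_entity() and replenish_dl_new_period() do as described in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a deadline entity; all times are in nanoseconds. */
struct toy_dl_entity {
	int64_t  runtime;     /* remaining budget, negative after an overrun */
	uint64_t deadline;    /* absolute deadline */
	uint64_t dl_runtime;  /* budget granted per period */
	uint64_t dl_deadline; /* relative deadline */
	uint64_t dl_period;   /* replenishment period */
};

/* Wrap-safe "a is before b" comparison. */
bool toy_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Assumed model of the replenishment timer: it can only be armed if the
 * start of the next period (act) is not already in the past.
 */
bool toy_timer_can_arm(const struct toy_dl_entity *se, uint64_t now)
{
	uint64_t act = se->deadline - se->dl_deadline + se->dl_period;

	return !toy_time_before(act, now);
}

/*
 * Assumed model of the replenish step. Returns true when the entity
 * lagged so far behind `now` that it was reset to a fresh period,
 * which is the point where "DL replenish ..." would be printed.
 */
bool toy_replenish(struct toy_dl_entity *se, uint64_t now)
{
	/* Without these the catch-up loop below could not terminate. */
	assert(se->dl_runtime > 0 && se->dl_period > 0);

	/* Push the deadline forward until some budget is available. */
	while (se->runtime <= 0) {
		se->deadline += se->dl_period;
		se->runtime += (int64_t)se->dl_runtime;
	}

	/* Still behind the clock: restart from `now` with a full budget. */
	if (toy_time_before(se->deadline, now)) {
		se->deadline = now + se->dl_deadline;
		se->runtime = (int64_t)se->dl_runtime;
		return true;
	}
	return false;
}
```

With illustrative values of 50 ms runtime and a 1 s period and deadline,
an entity whose deadline is at 1 s and whose runtime is -1 ms, looked at
when the clock reads 5 s, cannot arm the timer (act is 1 s). The
catch-up loop runs once and leaves the deadline at 2 s, still behind the
clock, so the entity is reset to a deadline of 6 s with the full 50 ms
of runtime and is eligible to run again immediately.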
> It looks like your proposed fix might work in this particular corner
> case, but I am not 100% comfortable with not trying to replenish
> properly (catch up with runtime) at all. I wonder if we might then start
> missing some other corner case. Maybe we could try to catch this
> particular corner case before even attempting to start the dl_timer,
> since we know it will fail, and do something at that point?
>
You can consider the patch more as an error-proofing mechanism, and so
far, it has been working well on our platform.
However, it might be better to catch this particular corner case in
advance to prevent the issue.
> Thanks,
> Juri
>