Message-ID: <a1103727ffaaf5f4d1b077bc09a3cc5168c5708d.camel@mediatek.com>
Date: Sat, 21 Jun 2025 10:55:16 +0800
From: Kuyo Chang <kuyo.chang@...iatek.com>
To: Juri Lelli <juri.lelli@...hat.com>
CC: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann
	<dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, "Ben
 Segall" <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, "Valentin
 Schneider" <vschneid@...hat.com>, Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>, jstultz
	<jstultz@...gle.com>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential
 starvation when expiry time passed

On Fri, 2025-06-20 at 17:22 +0200, Juri Lelli wrote:
> On 20/06/25 11:00, Kuyo Chang wrote:
> 
> ...
> 
> > 
> 
> Thanks for the additional explanation.
> 
> The way I understand it now is the following (of course please correct
> me if I am still not getting it :)
> 
> - a dl_server is actively servicing NORMAL tasks, but suffers a lot of
>   IRQ load and cannot make much progress
> - it does make progress anyway, but it reaches
>   update_curr_dl_se@...ottle only after rq_clock has already passed
>   its current deadline
> - the dl_runtime_exceeded() branch is entered, but start_dl_timer()
>   fails as the computed act is still in the past
> - enqueue_dl_entity(REPLENISH) calls replenish_dl_entity(), which tries
>   to add runtime and advance the deadline, but time has moved on so far
>   that the deadline is still behind rq_clock() and so "DL replenish ..."
>   is printed
> - replenish_dl_new_period() updates runtime and deadline from the
>   current clock and the dl-server is put back to run (so it continues
>   to run over/starve FIFO tasks)
> 

Yes, "DL replenish ..." is the critical clue for identifying the root
cause of this issue.
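
The sequence in the quoted list can be sketched as a small standalone C
model. This is a simplified illustration under stated assumptions, not
the kernel code: struct dl_model and replenish_model() are hypothetical
names, the relative deadline is assumed equal to the period, and the
clock is passed in as a plain nanosecond value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a deadline entity (not the kernel struct). */
struct dl_model {
    int64_t  runtime;    /* remaining budget in ns; negative after an overrun */
    uint64_t deadline;   /* absolute deadline in ns */
    uint64_t dl_runtime; /* budget granted per period in ns */
    uint64_t dl_period;  /* period in ns; assumed equal to the relative deadline */
};

/*
 * Catch-up replenishment: grant one budget and push the deadline one
 * period at a time until the runtime is positive again. If the deadline
 * is still behind the clock afterwards, give up catching up and start a
 * fresh period from "now". Returns true when that fallback was taken,
 * which is the situation where the "DL replenish ..." message shows up.
 */
static bool replenish_model(struct dl_model *se, uint64_t now)
{
    assert(se->dl_runtime > 0 && se->dl_period > 0);

    while (se->runtime <= 0) {
        se->deadline += se->dl_period;
        se->runtime  += (int64_t)se->dl_runtime;
    }

    if (se->deadline < now) {
        /* Lagged too far: restart from the current clock with a full budget. */
        se->deadline = now + se->dl_period;
        se->runtime  = (int64_t)se->dl_runtime;
        return true;
    }
    return false;
}
```

For example, with runtime = -1000, deadline = 100000, dl_runtime = 50000,
dl_period = 1000000 and now = 5000000, one catch-up step only moves the
deadline to 1100000, which is still behind the clock, so the model falls
back to a fresh period with a full budget and the entity is immediately
eligible to run again.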

> It looks like your proposed fix might work in this particular corner
> case, but I am not 100% comfortable with not trying to replenish
> properly (catch up with runtime) at all. I wonder if we might then
> start missing some other corner case. Maybe we could try to catch this
> particular corner case before even attempting to start the dl_timer,
> since we know it will fail, and do something at that point?
> 

The patch is best viewed as an error-proofing mechanism, and so far it
has been working well on our platform.
However, it might be better to catch this particular corner case in
advance, before the timer start is even attempted, to prevent the issue.
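
A minimal sketch of what such an early check could look like, assuming
the timer's activation time is the start of the next period (absolute
deadline minus relative deadline plus period). The helper names
next_period_start() and timer_would_be_in_past() are hypothetical and
only illustrate the idea; what to do once the case is detected is left
open, as it is in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: start of the next period, derived from the
 * absolute deadline, the relative deadline and the period (all in ns).
 */
static uint64_t next_period_start(uint64_t deadline, uint64_t rel_deadline,
                                  uint64_t period)
{
    assert(deadline >= rel_deadline);
    return deadline - rel_deadline + period;
}

/*
 * Hypothetical early check: true when the replenishment timer would be
 * armed for an instant that is already behind the clock, i.e. when
 * starting the timer is known to fail before it is even attempted.
 */
static bool timer_would_be_in_past(uint64_t deadline, uint64_t rel_deadline,
                                   uint64_t period, uint64_t now)
{
    return next_period_start(deadline, rel_deadline, period) < now;
}
```

A caller could test timer_would_be_in_past() in the runtime-exceeded
path and handle the lagging entity there, instead of discovering the
lag later during replenishment.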

> Thanks,
> Juri
> 

