Message-ID: <20250704143612.998419-1-vincent.guittot@linaro.org>
Date: Fri, 4 Jul 2025 16:36:06 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: mingo@...hat.com,
peterz@...radead.org,
juri.lelli@...hat.com,
dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
vschneid@...hat.com,
dhaval@...nis.ca,
linux-kernel@...r.kernel.org
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH v2 0/6] sched/fair: Manage lag and run to parity with different slices
This follows the attempt to better track the maximum lag of tasks in the
presence of different slice durations:
[1] https://lore.kernel.org/all/20250418151225.3006867-1-vincent.guittot@linaro.org/
Since v1, tracking of the max slice has been removed from the patchset
because we now ensure that the lag of an entity remains in the range of:
[-(slice + tick) : (slice + tick)] with run_to_parity
and
[max(-slice, -(0.7+tick)) : max(slice, (0.7+tick))] without run to parity
As a result, there is no need to track the max slice of enqueued entities
anymore.
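To make the two ranges concrete, here is a minimal user-space sketch of the
bound computation. It is not kernel code: lag_limits(), lag_in_range() and
MIN_PROTECT_NS are names made up for illustration, all values are in
nanoseconds, and the range is assumed to be symmetric around zero (i.e. the
lower bound is read as the negated upper bound).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the 0.7ms minimum protection quoted in this series. */
#define MIN_PROTECT_NS 700000LL

struct lag_range {
	int64_t lo;	/* most negative lag allowed, in ns */
	int64_t hi;	/* most positive lag allowed, in ns */
};

/* Hypothetical helper: expected lag range for one entity. */
static struct lag_range lag_limits(int64_t slice_ns, int64_t tick_ns,
				   bool run_to_parity)
{
	int64_t limit;

	assert(slice_ns > 0 && tick_ns > 0);

	if (run_to_parity) {
		/* [-(slice + tick) : (slice + tick)] */
		limit = slice_ns + tick_ns;
	} else {
		/* Bounded by the larger of slice and (0.7ms + tick). */
		int64_t min_limit = MIN_PROTECT_NS + tick_ns;

		limit = slice_ns > min_limit ? slice_ns : min_limit;
	}

	/* Assumption: the range is symmetric around zero. */
	return (struct lag_range){ .lo = -limit, .hi = limit };
}

/* True when the lag stays inside the expected range. */
static bool lag_in_range(int64_t lag_ns, struct lag_range r)
{
	return lag_ns >= r.lo && lag_ns <= r.hi;
}
```

With a 3ms slice and a 4ms tick, this gives a limit of 7ms with run to
parity and 4.7ms without it.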
Patch 1 is a simple cleanup to ease following changes.
Patch 2 fixes the lag for NO_RUN_TO_PARITY. It has been put first because of
its simplicity. The running task has a minimum protection of 0.7ms before
EEVDF looks for another task.
Patch 3 ensures that the protection is canceled only if the waking task
will be selected by pick_task_fair. This case was mentioned by Peter
while reviewing v1.
Patch 4 modifies the duration of the protection to take into account the
shortest slice of enqueued tasks instead of the slice of the running task.
Patch 5 fixes the case of tasks not being eligible at wakeup or after
migrating but with a shorter slice. We need to update the duration of the
protection to not exceed the lag.
Patch 6 fixes the case of tasks still being eligible after the protected
period while others must run in order not to exceed the lag limit. This has
been highlighted in a test with delayed entities being dequeued with a
positive lag larger than their slice but it can happen for delayed dequeue
entity too.
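As a rough illustration of the protection described for patches 2, 4 and 6,
the sketch below computes how long the running task stays protected and when
a resched should be requested. It is a user-space model under stated
assumptions, not the code in kernel/sched/fair.c: protect_ns(),
need_resched_after() and MIN_PROTECT_NS are hypothetical names, and taking the
minimum of the running task's slice and the shortest enqueued slice is one
reading of patch 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the 0.7ms minimum protection quoted for patch 2. */
#define MIN_PROTECT_NS 700000LL

/*
 * Hypothetical helper: duration (ns) during which the running task is
 * protected from preemption.
 */
static int64_t protect_ns(int64_t curr_slice_ns, int64_t min_enqueued_slice_ns,
			  bool run_to_parity)
{
	assert(curr_slice_ns > 0 && min_enqueued_slice_ns > 0);

	/* NO_RUN_TO_PARITY: only the minimum protection applies. */
	if (!run_to_parity)
		return MIN_PROTECT_NS;

	/* RUN_TO_PARITY: limited by the shortest slice among the tasks. */
	return curr_slice_ns < min_enqueued_slice_ns ?
	       curr_slice_ns : min_enqueued_slice_ns;
}

/*
 * Patch 6 idea: once the protected period has elapsed, always request a
 * resched so that another entity can be picked if needed.
 */
static bool need_resched_after(int64_t ran_ns, int64_t protect)
{
	return ran_ns >= protect;
}
```

For example, a running task with a 3ms slice competing with an enqueued task
that has a 1ms slice is protected for 1ms with run to parity and for 0.7ms
without it.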
The patchset has been tested with rt-app on 37 different use cases. Some are
simple and should never trigger any problem but have been kept to increase
the test coverage. The tests have been run on dragon rb5 with affinity set
to the biggest cores. The lag has been checked when we update the entity's
lag at dequeue and every time we check if an entity is eligible.
                RUN_TO_PARITY    NO_RUN_TO_PARITY
                lag error        lag error
mainline        14/37            14/37
+ patch 1-2     14/37            0/37
+ patch 3-5     1/37             0/37
+ patch 6       0/37             0/37
Vincent Guittot (6):
  sched/fair: Use protect_slice() instead of direct comparison
  sched/fair: Fix NO_RUN_TO_PARITY case
  sched/fair: Remove spurious shorter slice preemption
  sched/fair: Limit run to parity to the min slice of enqueued entities
  sched/fair: Fix entity's lag with run to parity
  sched/fair: Always trigger resched at the end of a protected period

 kernel/sched/fair.c | 94 ++++++++++++++++++++++++---------------------
 1 file changed, 50 insertions(+), 44 deletions(-)
--
2.43.0