Message-ID: <25b32c05-7155-4290-8e16-64548fd27ce6@huawei.com>
Date: Thu, 22 Jan 2026 15:52:22 +0800
From: "wangtao (EQ)" <wangtao554@...wei.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
CC: K Prateek Nayak <kprateek.nayak@....com>, <mingo@...hat.com>,
<peterz@...radead.org>, <juri.lelli@...hat.com>, <dietmar.eggemann@....com>,
<rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
<vschneid@...hat.com>, <tanghui20@...wei.com>, <zhangqiao22@...wei.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched/eevdf: Update se->vprot in reweight_entity()
Hi Vincent,
Thanks for the feedback.
Vincent Guittot wrote:
> On Wed, 21 Jan 2026 at 10:33, wangtao (EQ) <wangtao554@...wei.com> wrote:
>>
>> Hello Prateek,
>>
>> K Prateek Nayak wrote:
>>> Hello Wang,
>>>
>>> On 1/20/2026 6:01 PM, Wang Tao wrote:
>>>> In the EEVDF framework with Run-to-Parity protection, `se->vprot` is an
>>>> independent variable defining the virtual protection timestamp.
>>>>
>>>> When `reweight_entity()` is called (e.g., via nice/renice), it performs
>>>> the following actions to preserve Lag consistency:
>>>> 1. Scales `se->vlag` based on the new weight.
>>>> 2. Calls `place_entity()`, which recalculates `se->vruntime` based on
>>>> the new weight and scaled lag.
>>>>
>>>> However, the current implementation fails to update `se->vprot`, leading
>>>> to mismatches between the task's actual runtime and its expected duration.
>>>
>>> I don't think that is correct. "vprot" allows for "min_slice" worth of
>>> runtime from the beginning of the pick however, if we do a
>>> set_protect_slice() after reweight, we'll essentially grant another
>>> "min_slice" worth of time from the current "se->vruntime" (or until
>>> deadline if it is sooner) which is not correct.
>>>
>>>>
>>>> This patch fixes the issue by calling `set_protect_slice()` at the end of
>>>> `reweight_entity()`. This ensures that a new, valid protection slice is
>>>> committed based on the updated `vruntime` and the new weight, restoring
>>>> Run-to-Parity consistency immediately after a weight change.
>>>>
>>>> Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption")
>>>> Suggested-by: Zhang Qiao <zhangqiao22@...wei.com>
>>>> Signed-off-by: Wang Tao <wangtao554@...wei.com>
>>>> ---
>>>> kernel/sched/fair.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index e71302282671..bdd8c4e688ae 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -3792,6 +3792,8 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
>>>
>>> At the beginning of reweight_entity, we should first check if
>>> protect_slice() for curr is true or not:
>>>
>>> bool protect_slice = curr && protect_slice(se);
>>>
>>>> if (!curr)
>>>> __enqueue_entity(cfs_rq, se);
>>>> cfs_rq->nr_queued++;
>>>> + if (curr)
>>>> + set_protect_slice(cfs_rq, se);
>>>
>>>
>>> If protect_slice() was true to begin with, we should do:
>>>
>>> if (protect_slice)
>>> se->vprot = min_vruntime(se->vprot, se->deadline);
>>>
>>> This ensures that if our deadline has moved back, we only protect until
>>> the new deadline and the scheduler can re-evaluate after that. If there
>>> was an entity with a shorter slice at the beginning of the pick, the
>>> "vprot" should still reflect the old value that was calculated using
>>> "se->vruntime" at the time of the pick.
>>>
>>
>> Regarding your suggestion, I have a concern.
>> When a task's weight changes, I believe its "vprot" should also change
>> accordingly. If we keep the original "vprot" unchanged, the task's
>> actual runtime will not reach the expected "vprot".
>>
>> For example, if the weight decreases, the "vruntime" increases faster.
>> If "vprot" is not updated, the task will hit the limit much earlier than
>> expected in physical time.
>
> Ok but it's the safest way to stay in the min runnable slice limit so
> you need to demonstrate that its lag will remain below.
> Run to parity already goes close to this limit by not looking at a new
> running entity at every tick so we need to be sure to not break this
> rule.
>
I fully agree that staying within the Lag limit is paramount and we must
ensure the scheduler invariant is preserved.
>>
>> Therefore, I made this modification to ensure the protection slice is
>> valid. I would like to hear your thoughts on this.
>>
>>         if (protect_slice) {
>>                 se->vprot -= se->vruntime;
>>                 se->vprot = div_s64(se->vprot * se->load.weight, weight);
>>                 se->vprot += se->vruntime;
>>
>>                 se->vprot = min_vruntime(se->vprot, se->deadline);
>
> Why not use update_protect_slice() like when a new task with a shorter
> slice is added ?
I looked into reusing update_protect_slice(), but there is a
mathematical hurdle specific to reweight_entity(): the coordinate space
distortion.
In reweight_entity(), se->vruntime is rescaled to match the new weight,
but se->vprot remains in the old weight's coordinate system. If we simply
call update_protect_slice() or compare the two directly, we are comparing
values that advance at different virtual-time speeds, which leads to
incorrect truncation.
>
>> }
>>
>> Best Regards,
>> Tao
>
To address your safety concerns while fixing the coordinate mismatch, the
updated patch below does three things:

1. Capture: record the remaining protected slice
   (se->vprot - se->vruntime) before se->vruntime is rescaled, while both
   values are still in the old weight's coordinate system.
2. Scale: project that remaining slice into the new weight's virtual
   domain, preserving its physical-time length.
3. Clamp (safety): apply the scaled slice to the new vruntime, but
   strictly clamp the result with min_vruntime(..., se->deadline).

This approach preserves the physical-time promise (fixing the bug) while
keeping the min_vruntime guardrail you suggested, so we never protect past
the deadline/lag limit.
Does this align with what you had in mind?
Best,
Tao
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bdd8c4e688ae..52dc7deea7a1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3755,10 +3755,18 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
                            unsigned long weight)
 {
        bool curr = cfs_rq->curr == se;
+       bool protect = curr && protect_slice(se);
+       s64 vslice_rem = 0;
 
        if (se->on_rq) {
+               if (protect) {
+                       /* se->load.weight still holds the old weight here */
+                       vslice_rem = se->vprot - se->vruntime;
+                       vslice_rem = div_s64(vslice_rem * se->load.weight, weight);
+               }
+
                update_entity_lag(cfs_rq, se);
                se->deadline -= se->vruntime;
                se->rel_deadline = 1;
@@ -3792,8 +3800,11 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
                if (!curr)
                        __enqueue_entity(cfs_rq, se);
                cfs_rq->nr_queued++;
-               if (curr)
-                       set_protect_slice(cfs_rq, se);
+
+               if (protect && vslice_rem > 0) {
+                       se->vprot = se->vruntime + vslice_rem;
+                       se->vprot = min_vruntime(se->vprot, se->deadline);
+               }
        }
 }