[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAHb70uz8SZ49MSxyXvZfd-5AF7Y9KuUYbcKaZ_mbXD2zgu_0Yw@mail.gmail.com>
Date: Tue, 10 Dec 2024 17:22:46 +0800
From: zihan zhou <15645113830zzh@...il.com>
To: peterz@...radead.org
Cc: bsegall@...gle.com, dietmar.eggemann@....com, juri.lelli@...hat.com,
linux-kernel@...r.kernel.org, mgorman@...e.de, mingo@...hat.com,
rostedt@...dmis.org, vincent.guittot@...aro.org, vschneid@...hat.com,
yaozhenguo@...com, zhouzihan30@...com
Subject: Re: [PATCH] sched: Forward deadline for early tick
Hello, may I ask what you think of my latest patch
(https://lore.kernel.org/all/20241127070534.62903-1-zhouzihan30@jd.com/)?
If there are any shortcomings here, I would be happy to continue
making revisions.
(I'm sorry, there seems to be a formatting issue with the email I
sent, so I sent it again)
zihan zhou <15645113830zzh@...il.com> 于2024年12月10日周二 16:58写道:
>
> Hello, may I ask what you think of my latest patch (https://lore.kernel.org/all/20241127070534.62903-1-zhouzihan30@jd.com/)? If there are any shortcomings here, I would be happy to continue making revisions.
>
> zhouzihan30 <15645113830zzh@...il.com> 于2024年11月27日周三 15:06写道:
>>
>> From: zhouzihan <zhouzihan30@...com>
>>
>> Thank you very much for your reply, which has brought me lots
>> of thoughts.
>>
>> I have reconsidered this issue and believe that the root cause is that
>> the kernel is difficult and unnecessary to implement an ideal eevdf
>> due to real hardware:
>> for example,
>> an ideal eevdf may bring frequent switching, its cost makes kernel can't
>> really do it.
>>
>> So I see that the kernel has a very concise and clever implementation:
>> update_deadline, which allows each task to use up the request size
>> as much as possible in one go.
>>
>> Here, the kernel has actually slightly violated eevdf: we are no longer
>> concerned about whether a task is eligible for the time being.
>>
>> In the prev patch, it was mentioned that due to tick errors, some tasks
>> run longer than requested. So if we can do this: when a task vruntime
>> approaches the deadline, we check if the task is eligible.
>> If it is eligible, we follow the previous logic and do not schedule it.
>> However, if it is ineligible, we schedule it because eevdf has the
>> responsibility to not exec ineligible task.
>>
>> In other words, the kernel has given the task a "benefit" based on the
>> actual situation, and we still have the right to revoke this benefit.
>>
>> In this way, it actually brings the kernel closer to an ideal eevdf,
>> and at the same time, your reply made me realize my mistake:
>> The deadline update should be updated using the following function,
>> which is more reasonable:
>> vd_i += r_i / w_i
>> By using it, our scheduler is still fair,
>> and each task can obtain its own time according to its weight.
>>
>> About tolerance, I think min(vslice/128, tick/2) is better,
>> as your reply, vslice maybe too big, so use it.
>>
>> However, there is still a new issue here:
>> If a se 1 terminates its deadline prematurely due to ineligible,
>> and then a se 2 runs, after the end of the run,
>> se 1 becomes eligible, but its deadline has already been pushed back,
>> which is not earliest eligible,
>> so se 1 can't run. This seems to not comply with eevdf specifications.
>>
>> But I think this is acceptable. In the past, the logic causes a task to
>> run an extra tick (ms), which means other tasks have to wait for an
>> extra tick. Now, a task maybe run less time (us), but it will be
>> compensated for in the next scheduling. In terms of overall accuracy,
>> I think it has improved.
>>
>> By the way, we may try to solve this by delaying the deadline update,
>> which means we schedule a task but not update its deadline,
>> util next exec it.
>> I am not sure if it is necessary to implement such complex logic here.
>> I think it is actually unnecessary,
>> because the ideal eevdf may not be suitable for the kernel.
>> It is no need to spend so much to solve the error of few time.
>> If there is a truly completely accurate system, it should not have
>> tick error, and just closes the FORWARD_DEADLINE feature.
>> Of course, if you think it is necessary, I am willing try to implement it.
>>
>> Signed-off-by: zhouzihan30 <zhouzihan30@...com>
>> ---
>> kernel/sched/fair.c | 41 +++++++++++++++++++++++++++++++++++++----
>> kernel/sched/features.h | 7 +++++++
>> 2 files changed, 44 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 2d16c8545c71..e6e58c7d6d4c 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1006,8 +1006,10 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);
>> */
>> static bool update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> {
>> - if ((s64)(se->vruntime - se->deadline) < 0)
>> - return false;
>> +
>> + u64 vslice;
>> + u64 tolerance = 0;
>> + u64 next_deadline;
>>
>> /*
>> * For EEVDF the virtual time slope is determined by w_i (iow.
>> @@ -1016,11 +1018,42 @@ static bool update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> */
>> if (!se->custom_slice)
>> se->slice = sysctl_sched_base_slice;
>> + vslice = calc_delta_fair(se->slice, se);
>> +
>> + next_deadline = se->vruntime + vslice;
>> +
>> + if (sched_feat(FORWARD_DEADLINE))
>> + tolerance = min(vslice>>7, TICK_NSEC/2);
>> +
>> + if ((s64)(se->vruntime + tolerance - se->deadline) < 0)
>> + return false;
>>
>> /*
>> - * EEVDF: vd_i = ve_i + r_i / w_i
>> + * when se->vruntime + tolerance - se->deadline >= 0
>> + * but se->vruntime - se->deadline < 0,
>> + * there is two case: if entity is eligible?
>> + * if entity is not eligible, we don't need wait deadline, because
>> + * eevdf don't guarantee
>> + * an ineligible entity can exec its request time in one go.
>> + * but when entity eligible, just let it run, which is the
>> + * same processing logic as before.
>> */
>> - se->deadline = se->vruntime + calc_delta_fair(se->slice, se);
>> +
>> + if (sched_feat(FORWARD_DEADLINE) && (s64)(se->vruntime - se->deadline) < 0) {
>> + if (entity_eligible(cfs_rq, se)) {
>> + return false;
>> + } else {
>> + /*
>> + * in this case, entity's request size does not use light,
>> + * but considering it is not eligible, we don't need exec it.
>> + * and we let vd_i += r_i / w_i, make scheduler fairness.
>> + */
>> + next_deadline = se->deadline + vslice;
>> + }
>> + }
>> +
>> +
>> + se->deadline = next_deadline;
>>
>> /*
>> * The task has consumed its request, reschedule.
>> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
>> index 290874079f60..5c74deec7209 100644
>> --- a/kernel/sched/features.h
>> +++ b/kernel/sched/features.h
>> @@ -24,6 +24,13 @@ SCHED_FEAT(RUN_TO_PARITY, true)
>> */
>> SCHED_FEAT(PREEMPT_SHORT, true)
>>
>> +/*
>> + * For some cases where the tick is faster than expected,
>> + * move the deadline forward
>> + */
>> +SCHED_FEAT(FORWARD_DEADLINE, true)
>> +
>> +
>> /*
>> * Prefer to schedule the task we woke last (assuming it failed
>> * wakeup-preemption), since its likely going to consume data we
>> --
>> 2.39.3 (Apple Git-146)
>>
Powered by blists - more mailing lists