linux-kernel - Re: [PATCH] sched/deadline: Throttle a constrained task activated if overflow

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58EEEB20.9070609@redhat.com>
Date:   Thu, 13 Apr 2017 11:06:08 +0800
From:   Xunlei Pang <xpang@...hat.com>
To:     luca abeni <luca.abeni@...tannapisa.it>
Cc:     Daniel Bristot de Oliveira <bristot@...hat.com>,
        Xunlei Pang <xlpang@...hat.com>, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Tommaso Cucinotta <tommaso.cucinotta@...up.it>,
        Rômulo Silva de Oliveira 
        <romulo.deoliveira@...c.br>,
        Mathieu Poirier <mathieu.poirier@...aro.org>
Subject: Re: [PATCH] sched/deadline: Throttle a constrained task activated if
 overflow

On 04/12/2017 at 09:10 PM, luca abeni wrote:
> On Wed, 12 Apr 2017 20:30:04 +0800
> Xunlei Pang <xpang@...hat.com> wrote:
> [...]
>>> If the relative deadline is different from the period, then the
>>> check is an approximation (and this is the big issue here). I am
>>> still not sure about what is the best thing to do in this case.
>>>  
>>>> E.g. For (runtime 2ms, deadline 4ms, period 8ms),
>>>> for some reason was preempted after running a very short time
>>>> 0.1ms, after 0.9ms it was scheduled back and immediately sleep
>>>> 1ms, when it is awakened, remaining runtime is 1.9ms, remaining
>>>> deadline(deadline-now) within this period is 2ms, but
>>>> dl_entity_overflow() is true. However, clearly it can be correctly
>>>> run 1.9ms before deadline comes wthin this period.  
>>> Sure, in this case the task can run for 1.9ms before the deadline...
>>> But doing so, it might break the guarantees for other deadline
>>> tasks. This is what the check is trying to avoid.  
>> Image this deadline task was preempted after running 0.1ms by another
>> one with an earlier absolute deadline for 1.9ms, after scheduled back
>> it will have the same status (1.9ms remaining runtime, 2ms remaining
>> deadline) under the current implementation.
> The big difference is that in your first example the task blocks for
> some time while being the earliest-deadline task (so, the scheduling
> algorithm would give some execution time to the task, but the task
> does not use it... So, when the task wakes up, it has to check if now it
> is too late for using its reserved execution time).
> On the other hand, in this second example the task is preempted by an
> earlier-deadline (higher priority) task, so there is no risk to have
> execution time reserved to the task that the task does not use
> (if I understand well your examples).

Yes, make sense, thanks!

>
>
>>>> We can add a condition in dl_runtime_exceeded(), if its deadline is
>>>> passed, then zero its runtime if positive, and a new period begins.
>>>>
>>>> I did some tests with the following patch, seems it works well,
>>>> please correct me if I am wrong. ---
>>>>  kernel/sched/deadline.c | 16 ++++++++++++----
>>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>>> index a2ce590..600c530 100644
>>>> --- a/kernel/sched/deadline.c
>>>> +++ b/kernel/sched/deadline.c
>>>> @@ -498,8 +498,7 @@ static void update_dl_entity(struct
>>>> sched_dl_entity *dl_se, struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>>>>      struct rq *rq = rq_of_dl_rq(dl_rq);
>>>>  
>>>> -    if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>>>> -        dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
>>>> +    if (!dl_time_before(rq_clock(rq), dl_se->deadline)) {
>>>>          dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
>>>>          dl_se->runtime = pi_se->dl_runtime;
>>>>      }  
>>> I think this will break the guarantees for tasks with relative
>>> deadline equal to period (think about a task with runtime 5ms,
>>> period 10ms and relative deadline 10ms... What happens if the task
>>> executes for 4.9ms, then blocks and immediately wakes up?)  
>> For your example, dl_se->deadline is after now, the if condition is
>> false, update_dl_entity() actually does nothing, that is, after
>> wake-up, it will carry (5ms-4.9ms)=0.1ms remaining runtime and
>> (10ms-4.9ms)=5.1ms remaining deadline/period, continue within the
>> current period. After running further 0.1ms, will be throttled by the
>> timer in update_curr_dl().
> Ok, sorry; I saw you inverted rq_clock(rq) and dl_se->deadline), but I
> did not notice you added a "!". My fault.
>
> In this case, with this change the task cannot consume more than
> runtime every period (which is good), but it still risks to cause
> unexpected missed deadlines.
> (if relative deadline == period, on uni-processor the current algorithm
> guarantees that the runtime will always be consumed before the
> scheduling deadline; on multi-processor it guarantees that the runtime
> will be consumed before a maximum delay from the deadline; with this
> change, it is not possible to provide this guarantee anymore).
>
>
> 				Luca
>
>> It's actually the original logic, I just removed dl_entity_overflow()
>> condition.
>>
>>
>> Regards,
>> Xunlei
>>
>>>
>>> 				Luca
>>>  
>>>> @@ -722,13 +721,22 @@ static inline void
>>>> dl_check_constrained_dl(struct sched_dl_entity *dl_se)
>>>> dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { if
>>>> (unlikely(dl_se->dl_boosted || !start_dl_timer(p))) return;
>>>> +
>>>> +        if (dl_se->runtime > 0)
>>>> +            dl_se->runtime = 0;
>>>> +
>>>>          dl_se->dl_throttled = 1;
>>>>      }
>>>>  }
>>>>  
>>>>  static
>>>> -int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
>>>> +int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity
>>>> *dl_se) {
>>>> +    if (!dl_time_before(rq_clock(rq), dl_se->deadline)) {
>>>> +        if (dl_se->runtime > 0)
>>>> +            dl_se->runtime = 0;
>>>> +    }
>>>> +
>>>>      return (dl_se->runtime <= 0);
>>>>  }
>>>>  
>>>> @@ -779,7 +787,7 @@ static void update_curr_dl(struct rq *rq)
>>>>      dl_se->runtime -= delta_exec;
>>>>  
>>>>  throttle:
>>>> -    if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
>>>> +    if (dl_runtime_exceeded(rq, dl_se) || dl_se->dl_yielded) {
>>>>          dl_se->dl_throttled = 1;
>>>>          __dequeue_task_dl(rq, curr, 0);
>>>>          if (unlikely(dl_se->dl_boosted
>>>> || !start_dl_timer(curr)))