Date:   Tue, 11 Apr 2017 11:24:26 +0200
From:   Daniel Bristot de Oliveira <bristot@...hat.com>
To:     xlpang@...hat.com, linux-kernel@...r.kernel.org
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Luca Abeni <luca.abeni@...tannapisa.it>,
        Steven Rostedt <rostedt@...dmis.org>,
        Tommaso Cucinotta <tommaso.cucinotta@...up.it>,
        Rômulo Silva de Oliveira
        <romulo.deoliveira@...c.br>,
        Mathieu Poirier <mathieu.poirier@...aro.org>
Subject: Re: [PATCH] sched/deadline: Throttle a constrained task activated if
 overflow

On 04/11/2017 09:06 AM, Xunlei Pang wrote:
> On 04/11/2017 at 01:53 PM, Xunlei Pang wrote:
>> On 04/11/2017 at 04:47 AM, Daniel Bristot de Oliveira wrote:
>>> On 04/10/2017 11:22 AM, Xunlei Pang wrote:
>>>> I was testing Daniel's changes with his test case in the commit
>>>> df8eac8cafce ("sched/deadline: Throttle a constrained deadline
>>>> task activated after the deadline"), and tweaked it a little.
>>>>
>>>> Instead of having the runtime equal to the deadline, I tweaked
>>>> runtime, deadline and sleep value to ensure every time it calls
>>>> dl_check_constrained_dl() with "dl_se->deadline > rq_clock(rq)"
>>>> as well as true dl_entity_overflow(), so it does replenishing
>>>> every wake up in update_dl_entity(), and break its bandwidth.
>>>>
>>>> Daniel's test case had:
>>>> attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */
>>>> attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms*/
>>>> attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */
>>>> ts.tv_sec = 0;
>>>> ts.tv_nsec = 2000 * 1000; /* 2 ms */
>>>>
>>>> I changed it to:
>>>> attr.sched_runtime = 5 * 1000 * 1000; /* 5 ms */
>>>> attr.sched_deadline = 7 * 1000 * 1000; /* 7 ms */
>>>> attr.sched_period = 1 * 1000 * 1000 * 1000; /* 1 s */
>>>> ts.tv_sec = 0;
>>>> ts.tv_nsec = 1000 * 1000; /* 1 ms */
>>>>
>>>> The change above can result in over 25% of the CPU on my machine.
>>>>
>>>> In order to avoid the breakage, we improve dl_check_constrained_dl()
>>>> to prevent dl tasks from being activated until the next period if they
>>>> run out of bandwidth in the current period.
>>> The problem now is that, with your patch, we will throttle the task
>>> while it may still have runtime left. Moreover, the task did not break any
>>> rule, like being awakened after the deadline - the user-space is not
>>> misbehaving.
>>>
>>> That is +- what the reproducer is doing when using your patch,
>>> (I put some trace_printk when noticing the overflow in the wakeup).
>>>
>>>           <idle>-0     [007] d.h.  1505.066439: enqueue_task_dl: my current runtime is 3657361 and the deadline is 4613027 from now 
>>>           <idle>-0     [007] d.h.  1505.066439: enqueue_task_dl: 	my dl_runtime is 5000000
>>>
>>> and so the task will be throttled with 3657361 ns runtime available.
>>>
>>> As we can see, it is really breaking the density:
>>>
>>> 5ms / 7ms (.714285) < 3657361 / 4613027 (.792833)
>>>
>>> Well, it is not breaking that much. Trying to be less pessimistic, we can
>>> compute a new runtime with the following equation:
>>>
>>> runtime = (dl_runtime / dl_deadline) * (deadline - now)
>>>
>>> That is, a runtime which fits in the task's density.
>> This is a good point, to make the best use of remaining deadline, let me think more.
> I don't know if this will make things more complicated, we can see in update_dl_entity():
>    if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>         dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
>         dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
>         dl_se->runtime = pi_se->dl_runtime;
>     }
> 
> Looks like we have a similar issue for non-constrained tasks in case of true dl_entity_overflow(): although
> the deadline is promoted (BTW, I think on overflow it should be dl_se->deadline += pi_se->dl_deadline?),
> the previous runtime is discarded, so we may need to apply the same runtime truncating logic there as well
> if we want to truncate runtime.
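
The density-based truncation quoted above, runtime = (dl_runtime / dl_deadline) * (deadline - now), can be sketched in C as below. This is only an illustration: density_scaled_runtime() is a hypothetical helper, not a kernel function, and the plain 64-bit multiply assumes nanosecond values small enough not to overflow.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): scale the time left until the
 * deadline by the task's density dl_runtime / dl_deadline.
 * All values are in nanoseconds.
 */
uint64_t density_scaled_runtime(uint64_t dl_runtime, uint64_t dl_deadline,
                                uint64_t time_to_deadline)
{
    /* Multiply before dividing so the integer result keeps precision. */
    return (dl_runtime * time_to_deadline) / dl_deadline;
}
```

With the values from the trace (dl_runtime = 5000000, dl_deadline = 7000000, deadline 4613027 ns from now) this gives 3295019 ns, which is below the 3657361 ns the task still had.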

For implicit deadline tasks, the density is equal to the utilization.
So things here are less pessimistic, and we are using the same
rule for overflow and admission.
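
A simplified sketch of that kind of overflow test, written as a cross-multiplied comparison so no division is needed. The function name and the dl_window parameter are assumptions made for illustration; this is not the kernel's dl_entity_overflow(). Passing the relative deadline as dl_window checks against the density, passing the period checks against the utilization, and for an implicit deadline task the two are the same.

```c
#include <stdint.h>

/*
 * Illustrative only: returns 1 when
 *     runtime / time_to_deadline > dl_runtime / dl_window
 * i.e. when the residual runtime would exceed the task's bandwidth over
 * the time left until the current deadline. Values are in nanoseconds.
 */
int runtime_overflows(uint64_t runtime, uint64_t time_to_deadline,
                      uint64_t dl_runtime, uint64_t dl_window)
{
    /* Cross-multiply both ratios to stay in integer arithmetic. */
    return runtime * dl_window > time_to_deadline * dl_runtime;
}
```

For the traced wakeup, runtime_overflows(3657361, 4613027, 5000000, 7000000) is true, which matches the .792833 > .714285 comparison earlier in the thread.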

If you push "dl_se->deadline += pi_se->dl_deadline", you will give the task

runtime / ((deadline - now) + period)

of the CPU.

As the deadline is in the future (the !dl_time_before(dl_se->deadline,
rq_clock(rq)) case), (deadline - now) + period > period. Therefore, the
next period will receive less than runtime / period, and the task will
receive a lower priority - a farther deadline.
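
To put numbers on this, here is a small sketch. The helper bandwidth_ppm() and the example values (runtime 5 ms, period 10 ms, deadline 4 ms in the future) are made up for illustration and do not come from the test case.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bandwidth runtime / window expressed in parts per
 * million, so the comparison can stay in integer arithmetic.
 * Both arguments are in nanoseconds.
 */
uint64_t bandwidth_ppm(uint64_t runtime, uint64_t window)
{
    return (runtime * 1000000ULL) / window;
}
```

With the regular rule the task gets bandwidth_ppm(5000000, 10000000) = 500000, that is Q/P = 50%. If the deadline is pushed by one period while it is still 4 ms away, the window grows to 14 ms and bandwidth_ppm(5000000, 14000000) = 357142, about 35.7%.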

Well, we could choose not to truncate the runtime, but then we would carry
things over from the past. If the task self-suspended for a long
period, the sum of the residual runtime + runtime could give
a higher utilization than Q/P.

Moreover, the current rule provides a guarantee that Tommaso likes.
The CBS self-adjusts the period in the case of a timing drift of the
external "event" that activates the task, without breaking Q/P.

(He can explain more about it....)

The current rule guarantees Q/P, which is the well-known CBS behavior.

Am I missing something?

-- Daniel
