lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170210164758.029590e1@gandalf.local.home>
Date:   Fri, 10 Feb 2017 16:47:58 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Daniel Bristot de Oliveira <bristot@...hat.com>
Cc:     linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@....com>,
        Tommaso Cucinotta <tommaso.cucinotta@...up.it>,
        Luca Abeni <luca.abeni@...tannapisa.it>
Subject: Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in
 the next period

On Fri, 10 Feb 2017 20:48:10 +0100
Daniel Bristot de Oliveira <bristot@...hat.com> wrote:

> Currently, the replenishment timer is set to fire at the deadline
> of a task. Although that works for implicit deadline tasks because the
> deadline is equals to the begin of the next period, that is not correct
> for constrained deadline tasks (deadline < period).
> 
> For instance:
> 
> f.c:
>  --------------- %< ---------------
> int main (void)
> {
> 	for(;;);
> }
>  --------------- >% ---------------  
> 
>   # gcc -o f f.c
> 
>   # trace-cmd record -e sched:sched_switch                              \
>    				   -e syscalls:sys_exit_sched_setattr   \
>    chrt -d --sched-runtime  490000000 					\
>            --sched-deadline 500000000 					\
> 	   --sched-period  1000000000 0 ./f
> 
>   # trace-cmd report | grep "{pid of ./f}"

 # trace-cmd report -F '.*:common_pid == 11295'

does the same ;-)

Or better yet, add "-F" (just -F, no extra paremeter. In report, -F
means "filter", in record -F means "follow"), and that will only record
the task that is created.

Also, it may be nice to use --ts-diff option to see results. I ran this
with my own spinning program, and here's the output:

 # trace-cmd report --ts-diff
[...]
        userspin-1540  [003]   298.725084: (+822)   sys_exit_sched_setattr: 0x0
        userspin-1540  [003]   299.218199: (+493115) sched_switch:         userspin:1540 [4294967295] R ==> rcuc/3:38 [98]
          <idle>-0     [003]   299.225111: (+6912)  sched_switch:         swapper/3:0 [120] R ==> userspin:1540 [4294967295]
        userspin-1540  [003]   299.718192: (+493081) sched_switch:         userspin:1540 [4294967295] R ==> ktimersoftd/3:39 [98]
          <idle>-0     [003]   300.225125: (+506933) sched_switch:         swapper/3:0 [120] R ==> userspin:1540 [4294967295]
        userspin-1540  [003]   300.717185: (+492060) sched_switch:         userspin:1540 [4294967295] R ==> rcuc/3:38 [98]
          <idle>-0     [003]   301.225116: (+507931) sched_switch:         swapper/3:0 [120] R ==> userspin:1540 [4294967295]
        userspin-1540  [003]   301.718177: (+493061) sched_switch:         userspin:1540 [4294967295] R ==> rcuc/3:38 [98]
          <idle>-0     [003]   302.225120: (+506943) sched_switch:         swapper/3:0 [120] R ==> userspin:1540 [4294967295]


> 
> After setting parameters, the task is replenished and continue running
> until being throttled.
>          f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0
> 
> The task is throttled after running 492318 ms, as expected.
>          f-11295 [003] 13322.606094: sched_switch:   f:11295 [-1] R ==> \
> 						   watchdog/3:32 [0]
> 
> But then, the task is replenished 500719 ms after the first
> replenishment:
>     <idle>-0     [003] 13322.614495: sched_switch:   swapper/3:0 [120] R \
> 						   ==> f:11295 [-1]  
> 
> Running for 490277 ms:
>          f-11295 [003] 13323.104772: sched_switch:   f:11295 [-1] R ==>  \
> 						   swapper/3:0 [120]
> 
> Hence, in the first period, the task runs 2 * runtime, and that is a bug.
> 
> During the first replenishment, the next deadline is set one period away.
> So the runtime/period starts to be respected. However, as the second
> replenishment took place in the wrong instant, the next replenishment
> will also be held in a wrong instant of time. Rather than occurring in
> the nth period away from the first activation, it is taking place
> in the (nth period - relative deadline).

OK, after you patch, I get this:

        userspin-1581  [003]   103.450129: (+716)   sys_exit_sched_setattr: 0x0
        userspin-1581  [003]   103.943957: (+493828) sched_switch:         userspin:1581 [4294967295] R ==> rcuc/3:38 [98]
          <idle>-0     [003]   104.450172: (+506215) sched_switch:         swapper/3:0 [120] R ==> userspin:1581 [4294967295]
        userspin-1581  [003]   104.942349: (+492177) sched_switch:         userspin:1581 [4294967295] R ==> watchdog/3:36 [0]
          <idle>-0     [003]   105.450164: (+507815) sched_switch:         swapper/3:0 [120] R ==> userspin:1581 [4294967295]
        userspin-1581  [003]   105.942354: (+492190) sched_switch:         userspin:1581 [4294967295] R ==> ktimersoftd/3:39 [98]
          <idle>-0     [003]   106.450159: (+507805) sched_switch:         swapper/3:0 [120] R ==> userspin:1581 [4294967295]
        userspin-1581  [003]   106.942366: (+492207) sched_switch:         userspin:1581 [4294967295] R ==> ktimersoftd/3:39 [98]
          <idle>-0     [003]   107.450164: (+507798) sched_switch:         swapper/3:0 [120] R ==> userspin:1581 [4294967295]

Looks promising. But unrelated, I noticed this with and without your
patch:

        userspin-1567  [002]    81.455345: (+1196)  sys_exit_sched_setattr: 0x0
        userspin-1567  [002]    81.457599: (+2254)  sched_switch:         chrt:1567 [4294967295] D ==> rcuc/2:29 [98]
          <idle>-0     [002]    81.466030: (+8431)  sched_switch:         swapper/2:0 [120] R ==> chrt:1567 [4294967295]
        userspin-1567  [002]    81.466123: (+93)    sched_switch:         chrt:1567 [4294967295] D ==> rcuc/2:29 [98]
          <idle>-0     [002]    81.471862: (+5739)  sched_switch:         swapper/2:0 [120] R ==> chrt:1567 [4294967295]
        userspin-1567  [002]    81.471959: (+97)    sched_switch:         chrt:1567 [4294967295] D ==> swapper/2:0 [120]
          <idle>-0     [002]    81.478285: (+6326)  sched_switch:         swapper/2:0 [120] R ==> chrt:1567 [4294967295]
        userspin-1567  [002]    81.478369: (+84)    sched_switch:         chrt:1567 [4294967295] D ==> swapper/2:0 [120]
          <idle>-0     [002]    81.486664: (+8295)  sched_switch:         swapper/2:0 [120] R ==> chrt:1567 [4294967295]
        userspin-1567  [002]    81.487291: (+627)   sched_switch:         chrt:1567 [4294967295] D|W ==> swapper/2:0 [120]
          <idle>-0     [002]    81.494884: (+7593)  sched_switch:         swapper/2:0 [120] R ==> chrt:1567 [4294967295]
        userspin-1567  [002]    81.988095: (+493211) sched_switch:         userspin:1567 [4294967295] R ==> rcuc/2:29 [98]
          <idle>-0     [002]    82.496009: (+507914) sched_switch:         swapper/2:0 [120] R ==> userspin:1567 [4294967295]
        userspin-1567  [002]    82.988158: (+492149) sched_switch:         userspin:1567 [4294967295] R ==> rcuc/2:29 [98]
          <idle>-0     [002]    83.494897: (+506739) sched_switch:         swapper/2:0 [120] R ==> userspin:1567 [4294967295]
        userspin-1567  [002]    83.986181: (+491284) sched_switch:         userspin:1567 [4294967295] R ==> rcuc/2:29 [98]
          <idle>-0     [002]    84.494895: (+508714) sched_switch:         swapper/2:0 [120] R ==> userspin:1567 [4294967295]

A bit of choppiness to get started. I'll have to look more into this.

I'll take a look at this patch deeper to try to understand everything
before applying a reviewed-by.

-- Steve

> 
> Signed-off-by: Daniel Bristot de Oliveira <bristot@...hat.com>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Juri Lelli <juri.lelli@....com>
> Cc: Tommaso Cucinotta <tommaso.cucinotta@...up.it>
> Cc: Luca Abeni <luca.abeni@...tannapisa.it>
> Cc: Steven Rostedt <rostedt@...dmis.org>
> Cc: linux-kernel@...r.kernel.org
> 
> ---
>  kernel/sched/deadline.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 70ef2b1..3c94d85 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	}
>  }
>  
> +static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
> +{
> +	return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
> +}
> +
>  /*
>   * If the entity depleted all its runtime, and if we want it to sleep
>   * while waiting for some new execution time to become available, we
> - * set the bandwidth enforcement timer to the replenishment instant
> + * set the bandwidth replenishment timer to the replenishment instant
>   * and try to activate it.
>   *
>   * Notice that it is important for the caller to know if the timer
> @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
>  	 * that it is actually coming from rq->clock and not from
>  	 * hrtimer's time base reading.
>  	 */
> -	act = ns_to_ktime(dl_se->deadline);
> +	act = ns_to_ktime(dl_next_period(dl_se));
>  	now = hrtimer_cb_get_time(timer);
>  	delta = ktime_to_ns(now) - rq_clock(rq);
>  	act = ktime_add_ns(act, delta);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ