Message-ID: <20160215101824.GD22643@pablo>
Date: Mon, 15 Feb 2016 10:18:24 +0000
From: Juri Lelli <juri.lelli@....com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Juri Lelli <juri.lelli@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Clark Williams <williams@...hat.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
John Kacur <jkacur@...hat.com>
Subject: Re: [PATCH] sched/deadline: Always calculate end of period on
sched_yield()
Hi,
On 12/02/16 18:10, Steven Rostedt wrote:
> I'm writing a test case for SCHED_DEADLINE, and noticed a strange
> anomaly. Every so often, a deadline is missed and when I looked into
> it, it happened because the sched_yield() had no effect (it didn't end
> the previous period and let the start of the next runtime happen on the
> end of the old period).
>
> deadline-2228 7...1 116.778420: sys_enter_sched_yield:
> deadline-2228 7d..3 116.778421: hrtimer_cancel: hrtimer=0xffff88011ebd79a0
> deadline-2228 7d..2 116.778422: rcu_utilization: Start context switch
> deadline-2228 7d..2 116.778423: rcu_utilization: End context switch
> deadline-2228 7d..4 116.778423: hrtimer_start: hrtimer=0xffff88011ebd79a0 function=hrtick/0x0 expires=116124420428 softexpires=116124420428
> deadline-2228 7...1 116.778425: sys_exit_sched_yield: 0x0
>
>
> Schedule was never called. I added some trace_printks() and discovered
> that this happens when sched_yield() is called right after a tick that
> updates its current bandwidth.
>
> When the schedule tick happens that updates the current bandwidth,
> update_curr_dl() is called, where it updates curr->se.exec_start to
> rq_clock_task(rq).
>
> The rq_clock_task(rq) value is updated by update_rq_clock_task(),
> which is called at various points in the scheduler.
>
> Now, if the user task calls sched_yield() just after a bandwidth update
> synced curr->se.exec_start to rq_clock_task(rq), when sched_yield()
> calls into update_curr_dl() we have:
>
> delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> if (unlikely((s64)delta_exec <= 0))
> return;
>
> Coming in here from a sched_yield() will have delta_exec == 0 if the
> sched_yield() was called after a DL tick and before another
> update_rq_clock_task() is called.
>
> This means that the task will not release its remaining runtime, and
> it will start off in the current period when it expected to be in the
> next period.
>
> The fix that appears to work for me is to add a test in
> update_curr_dl() to not exit if delta_exec is zero and
> dl_se->dl_yielded is true.
>
> Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index cd64c979d0e1..1dd180cda574 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
> * approach need further study.
> */
> delta_exec = rq_clock_task(rq) - curr->se.exec_start;
> - if (unlikely((s64)delta_exec <= 0))
> + if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
> return;
>
This looks good to me. Do you think we could also skip some of the
following updates/accounting in this case? Not sure we win anything by
doing that, though.
Thanks,
- Juri