Date:	Fri, 12 Feb 2016 18:10:20 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	LKML <linux-kernel@...r.kernel.org>
Cc:	Juri Lelli <juri.lelli@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Clark Williams <williams@...hat.com>,
	Daniel Bristot de Oliveira <bristot@...hat.com>,
	John Kacur <jkacur@...hat.com>
Subject: [PATCH] sched/deadline: Always calculate end of period on
 sched_yield()

I'm writing a test case for SCHED_DEADLINE, and noticed a strange
anomaly. Every so often, a deadline is missed, and when I looked into
it, it happened because the sched_yield() had no effect (it didn't end
the previous period and let the next runtime start at the end of the
old period).

deadline-2228    7...1   116.778420: sys_enter_sched_yield: 
deadline-2228    7d..3   116.778421: hrtimer_cancel:       hrtimer=0xffff88011ebd79a0
deadline-2228    7d..2   116.778422: rcu_utilization:      Start context switch
deadline-2228    7d..2   116.778423: rcu_utilization:      End context switch
deadline-2228    7d..4   116.778423: hrtimer_start:        hrtimer=0xffff88011ebd79a0 function=hrtick/0x0 expires=116124420428 softexpires=116124420428
deadline-2228    7...1   116.778425: sys_exit_sched_yield: 0x0

Schedule was never called. I added some trace_printk()s and discovered
that this happens when sched_yield() is called right after a tick that
updates its current bandwidth.

When the schedule tick happens that updates the current bandwidth,
update_curr_dl() is called, where it updates curr->se.exec_start to
rq_clock_task(rq).

The rq_clock_task(rq) value gets updated by update_rq_clock_task(),
which is called from various points in the scheduler.

Now, if the user task calls sched_yield() just after a bandwidth update
synced curr->se.exec_start to rq_clock_task(rq), when sched_yield()
calls into update_curr_dl() we have:

	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
	if (unlikely((s64)delta_exec <= 0))
		return;

Coming in here from a sched_yield() will have delta_exec == 0 if the
sched_yield() was called after a DL tick and before another
update_rq_clock_task() is called.
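
To make the sequence concrete, here is a minimal userspace sketch of the
pre-patch behavior. It is not kernel code: struct model_rq, struct
model_task and both functions are hypothetical stand-ins that only mirror
the delta_exec check quoted above, and the clock and runtime values are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the runqueue and task state. */
struct model_rq   { uint64_t clock_task; };               /* rq_clock_task(rq)   */
struct model_task { uint64_t exec_start; int64_t runtime; }; /* curr->se.exec_start */

/*
 * Models the pre-patch check in update_curr_dl().
 * Returns 1 if runtime was accounted, 0 if it took the early return.
 */
static int model_update_curr_dl(struct model_rq *rq, struct model_task *t)
{
	uint64_t delta_exec;

	assert(rq && t);
	delta_exec = rq->clock_task - t->exec_start;
	if ((int64_t)delta_exec <= 0)
		return 0;			/* nothing happens for the caller */

	t->runtime -= (int64_t)delta_exec;
	t->exec_start = rq->clock_task;		/* sync exec_start to the clock */
	return 1;
}

/*
 * A tick followed by a yield. clock_advance is how far the modeled
 * task clock moved between the two calls (0 models the case where
 * update_rq_clock_task() did not run in between).
 */
static int model_tick_then_yield(uint64_t clock_advance)
{
	struct model_rq rq = { .clock_task = 1000 };
	struct model_task t = { .exec_start = 900, .runtime = 500 };

	model_update_curr_dl(&rq, &t);		/* tick path */
	rq.clock_task += clock_advance;
	return model_update_curr_dl(&rq, &t);	/* sched_yield() path */
}
```

In this model, model_tick_then_yield(0) returns 0 (the yield path exits
early), while any non-zero advance lets the second call proceed.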

This means that the task will not release its remaining runtime, and
it will start off in the current period when it expected to be in the
next period.

The fix that appears to work for me is to add a test in
update_curr_dl() to not exit if delta_exec is zero and
dl_se->dl_yielded is true.
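
The changed condition can be checked in isolation with a small userspace
sketch. The helper name model_should_return_early() is hypothetical, the
unlikely() hint is dropped, and only the predicate from the patch below
is modeled, not the rest of update_curr_dl().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the patched early-return test:
 *   (s64)delta_exec <= 0 && !dl_se->dl_yielded
 * Returns true when the modeled update_curr_dl() would return early.
 */
static bool model_should_return_early(int64_t delta_exec, bool dl_yielded)
{
	return delta_exec <= 0 && !dl_yielded;
}
```

With delta_exec == 0, the helper returns true when dl_yielded is false
(same as before the patch) and false when dl_yielded is true, so a
yielding task no longer takes the early return. A positive delta_exec
never returns early in either version.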

Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cd64c979d0e1..1dd180cda574 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -735,7 +735,7 @@ static void update_curr_dl(struct rq *rq)
 	 * approach need further study.
 	 */
 	delta_exec = rq_clock_task(rq) - curr->se.exec_start;
-	if (unlikely((s64)delta_exec <= 0))
+	if (unlikely((s64)delta_exec <= 0 && !dl_se->dl_yielded))
 		return;

 	schedstat_set(curr->se.statistics.exec_max,