Date:	Fri, 14 Aug 2015 07:20:20 +0800
From:	Yuyang Du <yuyang.du@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Byungchul Park <byungchul.park@....com>, mingo@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: sync with the cfs_rq when changing sched class

On Thu, Aug 13, 2015 at 05:22:12PM +0200, Peter Zijlstra wrote:
> > When did I say "don't need to consider..."?
> > 
> > Doing more does not necessarily mean doing better, nor is it trivial.
> > BTW, a task going through switched_to() does not have to have gone
> > through switched_from() before.
> 
> Correct, there's a few corner cases we need to consider.
> 
> However, I think we unconditionally call init_entity_runnable_average()
> on all tasks, regardless of their 'initial' sched class, so it should
> have a valid state.
> 
> Another thing to consider is the state being very stale: suppose it
> started life as FAIR, ran for a bit, got switched to !FAIR by means of
> sys_sched_setscheduler()/sys_sched_setattr() or similar, runs for a
> long time, and for some reason gets switched back to FAIR; then we
> need to age and/or re-init things.
> 
> I _think_ we can use last_update_time for that, but I've not looked too
> hard.
> 
> That is, age based on last_update_time, if all 0, reinit, or somesuch.
> 
> 
> The most common case of switched_from()/switched_to() is Priority
> Inheritance, and that typically results in very short lived stints as
> !FAIR and the avg data should be still accurate by the time we return.

Right, we have the following cases to consider:
(queued or not) x (PI or sched_setscheduler) x (switched_from_fair or not)

After enumerating them, it simply boils down to this basic question:
with a record that has been stale for either a short or a long time,
what do we predict? E.g., after a brief PI boost the old average is
probably still accurate, but after a long stint as !FAIR it may tell
us nothing.

The answer can depend on perspective, preference, etc., and it also
depends on how frequently these changes happen, which is data we lack.
Without it, we don't have the context to see what difference any
choice makes.
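
FWIW, for the aging itself I imagine something like the below
(untested sketch only: decay_entity_avg() is a made-up name standing
in for whatever ends up reusing the PELT decay machinery, and the
call would sit in switched_to_fair() or thereabouts):

/*
 * Untested sketch: on switching back to fair, either re-init an
 * average that was never established, or decay it by the time the
 * task spent outside CFS.
 */
static void age_or_reinit_avg(struct rq *rq, struct task_struct *p)
{
	struct sched_avg *sa = &p->se.avg;

	if (!sa->last_update_time) {
		/* Never accounted as FAIR: start from scratch. */
		init_entity_runnable_average(&p->se);
		return;
	}

	/*
	 * Age the stale contribution by the length of the !FAIR stint,
	 * so a long absence does not feed old numbers into the cfs_rq.
	 */
	decay_entity_avg(sa, cfs_rq_clock_task(&rq->cfs) - sa->last_update_time);
}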

To address the lack of data, I think the following patch might be
helpful first.

--
sched: Provide sched class and priority change statistics per task

Sched class and priority changes have a substantial impact on a task,
but we do not have a quantitative understanding of how frequent they
are, which makes it difficult to work on many cases, especially some
corner cases. Provide the counts at the task level.

Signed-off-by: Yuyang Du <yuyang.du@...el.com>
---
 include/linux/sched.h |    3 +++
 kernel/sched/core.c   |    8 +++++++-
 kernel/sched/debug.c  |    2 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5b50082..9111696 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1228,6 +1228,9 @@ struct sched_statistics {
 	u64			nr_wakeups_affine_attempts;
 	u64			nr_wakeups_passive;
 	u64			nr_wakeups_idle;
+
+	u64			class_change;
+	u64			priority_change;
 };
 #endif
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 655557d..b65af01 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1009,12 +1009,18 @@ static inline void check_class_changed(struct rq *rq, struct task_struct *p,
 				       int oldprio)
 {
 	if (prev_class != p->sched_class) {
+		schedstat_inc(p, se.statistics.class_change);
+		schedstat_inc(p, se.statistics.priority_change);
+
 		if (prev_class->switched_from)
 			prev_class->switched_from(rq, p);
 
 		p->sched_class->switched_to(rq, p);
-	} else if (oldprio != p->prio || dl_task(p))
+	} else if (oldprio != p->prio || dl_task(p)) {
+		schedstat_inc(p, se.statistics.priority_change);
+
 		p->sched_class->prio_changed(rq, p, oldprio);
+	}
 }
 
 void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 6415117..fac8b89 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -597,6 +597,8 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 	P(se.statistics.nr_wakeups_affine_attempts);
 	P(se.statistics.nr_wakeups_passive);
 	P(se.statistics.nr_wakeups_idle);
+	P(se.statistics.class_change);
+	P(se.statistics.priority_change);
 
 	{
 		u64 avg_atom, avg_per_cpu;
-- 
1.7.9.5
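
With CONFIG_SCHEDSTATS enabled, the two counters then show up in
/proc/<pid>/sched alongside the other se.statistics fields, e.g.
(values illustrative):

$ grep _change /proc/<pid>/sched
se.statistics.class_change                   :                    2
se.statistics.priority_change                :                    7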

