Message-Id: <1437297060-25378-1-git-send-email-byungchul.park@lge.com>
Date: Sun, 19 Jul 2015 18:11:00 +0900
From: byungchul.park@....com
To: mingo@...nel.org, peterz@...radead.org
Cc: linux-kernel@...r.kernel.org,
Byungchul Park <byungchul.park@....com>
Subject: [PATCH v3] sched: modify how to compute a slice and check a preemptability
From: Byungchul Park <byungchul.park@....com>
hello all,

i asked the question below in the previous version (v2) of this patch.
***
sysctl_sched_min_granularity must be defined clearly first. once it is
defined, how it should work can be decided. the definition can be either
case 1 or case 2 below.

case 1. every task must get a slice of at least sysctl_sched_min_granularity,
which is currently 0.75ms. in this case, increasing the number of tasks in a
rq stretches the whole latency, which most of you don't like because the
latency can be stretched too much. but it looks normal to me, since it
already happens in the !CONFIG_FAIR_GROUP_SCHED world with a large number of
tasks. i wonder why the CONFIG_FAIR_GROUP_SCHED world must be different from
the !CONFIG_FAIR_GROUP_SCHED world? anyway...

case 2. tasks can have a slice much smaller than sysctl_sched_min_granularity,
depending on their position in the hierarchy. if a rq has 8 equally weighted
sched entities, each of those has 8 equally weighted sched entities, and the
same is repeated for one more level, then a task can have a very small slice,
e.g. 0.75ms / 64 ~ 0.01ms. if you add more levels to the cgroup hierarchy, it
gets worse. in this situation, context switching overhead becomes very large.
what does sysctl_sched_min_granularity mean here? anyway...

i am not sure which of case 1 and case 2 is the right definition of
sysctl_sched_min_granularity. what do you think about this?
***
i wrote this v3 patch based on case 1, assuming case 1 is right. if case 2
is right, then the modifications in check_preempt_tick() should be ignored.

does this make sense?

thank you,
byungchul
---------------->8----------------
>From 7ebce566af9b952d24494cd1258b481ec6639cc1 Mon Sep 17 00:00:00 2001
From: Byungchul Park <byungchul.park@....com>
Date: Sun, 19 Jul 2015 17:11:37 +0900
Subject: [PATCH v3] sched: modify how to compute a slice and check a
 preemptability

make the cfs scheduler use the rq level nr_running to compute a period when
CONFIG_FAIR_GROUP_SCHED is enabled. using the local cfs_rq's nr_running to
get the period gives inconsistent results. for example, imagine the cgroup
structure below.

root(=rq.cfs)--group1----a
                     |---b
                     |---c
                     |---d
                     |---e
                     |---f
                     |---g
                     |---h
                     |---i
                     |---j
                     |---k
                     |---l
                     |---m

in this case, with the current code, group1's slice is not comparable to
(a's slice + ... + m's slice). this also makes the code that uses
sum_exec_runtime inconsistent. it happens because the current code does not
use a consistent global value to get a global period.
in addition, modify the preemption check to ensure that a sched entity runs
for at least sysctl_sched_min_granularity before it can be preempted.
Signed-off-by: Byungchul Park <byungchul.park@....com>
---
 kernel/sched/fair.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 09456fc..41c619f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -635,7 +635,7 @@ static u64 __sched_period(unsigned long nr_running)
  */
 static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	u64 slice = __sched_period(cfs_rq->nr_running + !se->on_rq);
+	u64 slice = __sched_period(rq_of(cfs_rq)->cfs.nr_running + !se->on_rq);
 
 	for_each_sched_entity(se) {
 		struct load_weight *load;
@@ -3226,6 +3226,12 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 	struct sched_entity *se;
 	s64 delta;
 
 	ideal_runtime = sched_slice(cfs_rq, curr);
 	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
+	/*
+	 * Ensure that a task executes at least for sysctl_sched_min_granularity
+	 */
+	if (delta_exec < sysctl_sched_min_granularity)
+		return;
+
 	if (delta_exec > ideal_runtime) {
@@ -3243,9 +3249,6 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 	 * narrow margin doesn't have to wait for a full slice.
 	 * This also mitigates buddy induced latencies under load.
 	 */
-	if (delta_exec < sysctl_sched_min_granularity)
-		return;
-
 	se = __pick_first_entity(cfs_rq);
 	delta = curr->vruntime - se->vruntime;
 
--
1.7.9.5