Message-Id: <20101012094622.1611DCE142@carpathia.dereferenced.org>
Date: Tue, 12 Oct 2010 00:32:02 -0500
From: William Pitcock <nenolod@...eferenced.org>
To: linux-kernel@...r.kernel.org
Cc: mingo@...e.hu, peterz@...radead.org, efault@....de,
kernel@...ivas.org
Subject: [PATCH try 6] CFS: Add hierarchical tree-based penalty.

Inspired by the recent change to BFS by Con Kolivas, this patch causes a
task's vruntime to be penalized based on its fork depth from the root
task group.
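
For illustration: a task whose ancestry is, say, init -> bash -> make ->
cc1 ends up with fork_depth 4, so the vruntime computed for it at
placement is multiplied by 4, pushing it further to the right in the
runqueue than a task sitting near the root of the tree. (The depths here
are hypothetical; the real value depends on how the tree was spawned.)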
I have, for the moment, decided to make it a default feature, since the
design of CFS ensures that broken applications which depend on
traditional task enqueue behaviour will continue to work.
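
As with any other SCHED_FEAT entry, the penalty can presumably be
toggled at runtime through debugfs (assuming CONFIG_SCHED_DEBUG is
enabled):

    echo NO_HIERARCHICAL_PENALTY > /sys/kernel/debug/sched_features
    echo HIERARCHICAL_PENALTY > /sys/kernel/debug/sched_features

to disable and re-enable it, respectively.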
Changelog:
try6:
- nothing but this changelog.
try5:
- rename task_struct.parent_count to task_struct.fork_depth.
- ensure the sched_entity we're working on is actually a task before
  doing anything.
- multiply vruntime by the fork depth instead of dividing.
try4:
- some crazy thing involving sched_vslice() and dividing its value based
  on the number of forks. It had an interesting effect, but it was wrong.

Signed-off-by: William Pitcock <nenolod@...eferenced.org>
---
 include/linux/sched.h   |    2 ++
 kernel/sched.c          |    4 ++++
 kernel/sched_fair.c     |    8 ++++++++
 kernel/sched_features.h |   12 ++++++++++++
 4 files changed, 26 insertions(+), 0 deletions(-)
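
For quick testing, a throwaway helper like the one below (hypothetical,
not part of this patch) can be used to spawn a fork chain and let the
deeply-nested child compete against a shallow busy loop, e.g. under
top(1):

/* fork-chain.c: spawn a chain of DEPTH processes; the innermost
 * child burns CPU so the effect of its placement penalty
 * (vruntime *= fork_depth) can be observed.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define DEPTH 5

int main(void)
{
        volatile unsigned long n;
        int i;

        for (i = 0; i < DEPTH; i++) {
                pid_t pid = fork();

                if (pid < 0) {
                        perror("fork");
                        exit(1);
                }
                if (pid > 0) {
                        /* parent: wait for the rest of the chain */
                        waitpid(pid, NULL, 0);
                        exit(0);
                }
                /* child: keep forking until DEPTH is reached */
        }

        /* innermost child: burn CPU for a while, then exit */
        for (n = 0; n < 2000000000UL; n++)
                ;

        return 0;
}
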
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1e2a6db..3f0cff8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1494,6 +1494,8 @@ struct task_struct {
                 unsigned long memsw_bytes; /* uncharged mem+swap usage */
         } memcg_batch;
 #endif
+
+        int fork_depth;
 };
 
 /* Future-safe accessor for struct task_struct's cpus_allowed. */
diff --git a/kernel/sched.c b/kernel/sched.c
index dc85ceb..3fc3ebc 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2621,6 +2621,10 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
 #endif
 
         rq = task_rq_lock(p, &flags);
+
+        if (!(clone_flags & CLONE_THREAD))
+                p->fork_depth++;
+
         activate_task(rq, p, 0);
         trace_sched_wakeup_new(p, 1);
         check_preempt_curr(rq, p, WF_FORK);
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index db3f674..a81acc5 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -737,6 +737,14 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
         if (initial && sched_feat(START_DEBIT))
                 vruntime += sched_vslice(cfs_rq, se);
 
+        if (sched_feat(HIERARCHICAL_PENALTY) &&
+            likely(entity_is_task(se))) {
+                struct task_struct *tsk = task_of(se);
+
+                if (tsk->fork_depth > 1)
+                        vruntime *= tsk->fork_depth;
+        }
+
         /* sleeps up to a single latency don't count. */
         if (!initial) {
                 unsigned long thresh = sysctl_sched_latency;
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 83c66e8..cf17f97 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -45,6 +45,18 @@ SCHED_FEAT(LAST_BUDDY, 1)
 SCHED_FEAT(CACHE_HOT_BUDDY, 1)
 
 /*
+ * Hierarchical tree-based penalty: penalize the service deficit by
+ * an order of magnitude for each parent process in the process
+ * tree. This has the natural effect of biasing preference away from
+ * fork()-hungry process trees, such as those spawned by make(1),
+ * which helps to preserve good latency.
+ *
+ * This also has the side effect of providing, in a limited way,
+ * per-user CPU entitlement partitioning.
+ */
+SCHED_FEAT(HIERARCHICAL_PENALTY, 1)
+
+/*
  * Use arch dependent cpu power functions
  */
 SCHED_FEAT(ARCH_POWER, 0)
--
1.7.2.1