[ Impact: Fixes the large vruntime spread problems I identified last fall,
  but might have bad side-effects on Xorg interactivity. See the INTERACTIVE
  feature in a following patch that addresses this. ]

Push the scheduler dynamic min_vruntime upon deschedule. This ensures that
the following workload does not grow the spread to insanely large values
over time (give it 1-2 minutes), which makes the scheduler behave oddly when
Xorg and latency-sensitive threads are combined: Xorg ends up at the
beginning of the spread, and the latency-sensitive workloads end up
somewhere in the middle of the spread.

periodic-fork.sh:

#!/bin/bash

while ((1)); do
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	tac /etc/passwd > /dev/null;
	sleep 1;
done

My test program is wakeup-latency.c, originally provided by Nokia. A 10ms
timer spawns a thread which reads the time and prints a warning if the
expected deadline has been missed by too much. It also warns about timer
overruns. It's available at:

http://www.efficios.com/pub/elc2010/wakeup-latency-0.1.tar.bz2

With periodic-fork.sh and Xorg running, without the DYN_MIN_VRUNTIME
feature, but with the INTERACTIVE, INTERACTIVE_FORK_EXPEDITED, TIMER and
TIMER_FORK_EXPEDITED features enabled:

....
min priority: 0, max priority: 0
late by: 6765.8 µs
late by: 5536.1 µs
overruns: 1
late by: 12212.3 µs
late by: 5477.5 µs
overruns: 1
late by: 12259.3 µs
overruns: 1
late by: 12224.9 µs
overruns: 1
late by: 12214.3 µs
overruns: 1
late by: 12196.2 µs
maximum latency: 12259.3 µs
average latency: 46.4 µs
missed timer events: 5

Now the same workload with the DYN_MIN_VRUNTIME feature enabled:

min priority: 0, max priority: 0
maximum latency: 2908.3 µs
average latency: 6.9 µs
missed timer events: 0

Inspired by a patch from Peter Zijlstra.

Signed-off-by: Mathieu Desnoyers
CC: Peter Zijlstra
---
 kernel/sched_fair.c     |   15 ++++++++++-----
 kernel/sched_features.h |    6 ++++++
 2 files changed, 16 insertions(+), 5 deletions(-)

Index: linux-2.6-lttng.git/kernel/sched_fair.c
===================================================================
--- linux-2.6-lttng.git.orig/kernel/sched_fair.c
+++ linux-2.6-lttng.git/kernel/sched_fair.c
@@ -301,9 +301,9 @@ static inline s64 entity_key(struct cfs_
 	return se->vruntime - cfs_rq->min_vruntime;
 }
 
-static void update_min_vruntime(struct cfs_rq *cfs_rq)
+static void update_min_vruntime(struct cfs_rq *cfs_rq, unsigned long delta_exec)
 {
-	u64 vruntime = cfs_rq->min_vruntime;
+	u64 vruntime = cfs_rq->min_vruntime, new_vruntime;
 
 	if (cfs_rq->curr)
 		vruntime = cfs_rq->curr->vruntime;
@@ -319,7 +319,12 @@ static void update_min_vruntime(struct c
 			vruntime = min_vruntime(vruntime, se->vruntime);
 	}
 
-	cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime);
+	new_vruntime = cfs_rq->min_vruntime;
+	if (sched_feat(DYN_MIN_VRUNTIME) && delta_exec)
+		new_vruntime += calc_delta_mine(delta_exec, NICE_0_LOAD,
+						&cfs_rq->load);
+
+	cfs_rq->min_vruntime = max_vruntime(new_vruntime, vruntime);
 }
 
 /*
@@ -513,7 +518,7 @@ __update_curr(struct cfs_rq *cfs_rq, str
 	delta_exec_weighted = calc_delta_fair(delta_exec, curr);
 
 	curr->vruntime += delta_exec_weighted;
-	update_min_vruntime(cfs_rq);
+	update_min_vruntime(cfs_rq, delta_exec);
 }
 
 static void update_curr(struct cfs_rq *cfs_rq)
@@ -822,7 +827,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 	if (se != cfs_rq->curr)
 		__dequeue_entity(cfs_rq, se);
 	account_entity_dequeue(cfs_rq, se);
-	update_min_vruntime(cfs_rq);
+	update_min_vruntime(cfs_rq, 0);
 
 	/*
 	 * Normalize the entity after updating the min_vruntime because the
Index: linux-2.6-lttng.git/kernel/sched_features.h
===================================================================
--- linux-2.6-lttng.git.orig/kernel/sched_features.h
+++ linux-2.6-lttng.git/kernel/sched_features.h
@@ -57,6 +57,12 @@ SCHED_FEAT(LB_SHARES_UPDATE, 1)
 SCHED_FEAT(ASYM_EFF_LOAD, 1)
 
 /*
+ * Push the min_vruntime spread floor value when descheduling a task. This
+ * ensures the spread does not grow beyond control.
+ */
+SCHED_FEAT(DYN_MIN_VRUNTIME, 0)
+
+/*
  * Spin-wait on mutex acquisition when the mutex owner is running on
  * another cpu -- assumes that when the owner is running, it will soon
  * release the lock. Decreases scheduling overhead.
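
For readers who want to check the arithmetic of the min_vruntime push outside
the kernel, here is a minimal userspace sketch of what the modified
update_min_vruntime() computes. It is an illustration under assumptions, not
the kernel code: struct rq_model and update_min_vruntime_model() are
hypothetical names, the caller passes in the smaller of the current and
leftmost vruntime directly, and a plain integer division stands in for
calc_delta_mine(), which uses fixed-point math in the kernel.

```c
#include <stdint.h>

#define NICE_0_LOAD 1024ULL	/* weight of a nice-0 task */

/* Hypothetical stand-in for the cfs_rq fields this patch touches. */
struct rq_model {
	uint64_t min_vruntime;	/* floor of the vruntime spread */
	uint64_t load_weight;	/* sum of the queued entities' weights */
};

/* Wrap-safe maximum, using a signed difference like the kernel helper. */
static uint64_t max_vruntime(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) > 0 ? b : a;
}

/*
 * smallest_vruntime: min(curr->vruntime, leftmost->vruntime), assumed to
 *                    be computed by the caller.
 * delta_exec:        nanoseconds the current task just ran (0 on dequeue).
 * dyn:               models sched_feat(DYN_MIN_VRUNTIME).
 */
static void update_min_vruntime_model(struct rq_model *rq,
				      uint64_t smallest_vruntime,
				      uint64_t delta_exec, int dyn)
{
	uint64_t new_vruntime = rq->min_vruntime;

	/* Push the floor by the run time scaled to the queue's total load. */
	if (dyn && delta_exec && rq->load_weight)
		new_vruntime += delta_exec * NICE_0_LOAD / rq->load_weight;

	/* The floor never moves backwards. */
	rq->min_vruntime = max_vruntime(new_vruntime, smallest_vruntime);
}
```

With the feature off, an entity that lags far behind pins min_vruntime in
place. With it on, the floor keeps advancing by the queue-wide fair share of
each delta_exec, which is what bounds the spread.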