Mike Galbraith reported poor interactivity[*] when the new shares
distribution code was combined with autogroups.

The root cause turns out to be a mis-ordering between the accounting of
accrued execution time and shares updates. Since update_curr() is issued
hierarchically, updating the parent entity's weight to reflect a child
enqueue/dequeue results in the parent's unaccounted execution time then
being accrued (against vruntime) at the new weight, as opposed to the
weight present at accumulation.

While this doesn't have much effect on processes with timeslices that
cross a tick, it is particularly problematic for an interactive process
(e.g. Xorg) which incurs many (tiny) timeslices. In this scenario almost
all updates are at dequeue, which can result in significant fairness
perturbation (especially if it is the only thread, resulting in potential
{tg->shares, MIN_SHARES} transitions).

Correct this by ensuring unaccounted time is accumulated prior to
manipulating an entity's weight.

[*] http://xkcd.com/619/ is perversely Nostradamian here.

Signed-off-by: Paul Turner
---
 kernel/sched_fair.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: tip3/kernel/sched_fair.c
===================================================================
--- tip3.orig/kernel/sched_fair.c
+++ tip3/kernel/sched_fair.c
@@ -767,8 +767,12 @@ static void update_cfs_load(struct cfs_r
 static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
 			    unsigned long weight)
 {
-	if (se->on_rq)
+	if (se->on_rq) {
+		/* commit outstanding execution time */
+		if (cfs_rq->curr == se)
+			update_curr(cfs_rq);
 		account_entity_dequeue(cfs_rq, se);
+	}

 	update_load_set(&se->load, weight);
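
For illustration only, the ordering problem described in the changelog can be
reproduced with a toy model. This is a sketch under stated assumptions, not
kernel code: struct toy_entity, the toy_* helpers and the TOY_* constants are
hypothetical stand-ins, and the vruntime charge is simplified to
delta * NICE_0_LOAD / weight with NICE_0_LOAD assumed to be 1024 and
MIN_SHARES assumed to be 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define TOY_NICE_0_LOAD 1024ULL
#define TOY_MIN_SHARES  2ULL

struct toy_entity {
	uint64_t weight;     /* current load weight */
	uint64_t pending_ns; /* execution time not yet folded into vruntime */
	uint64_t vruntime;   /* weighted virtual runtime, in ns */
};

/* Fold pending execution time into vruntime at the *current* weight. */
static void toy_update_curr(struct toy_entity *se)
{
	assert(se->weight > 0);
	se->vruntime += se->pending_ns * TOY_NICE_0_LOAD / se->weight;
	se->pending_ns = 0;
}

/*
 * Ordering the changelog describes as broken: the weight changes while
 * execution time is still pending, so that time is later charged at the
 * new weight.
 */
static void toy_reweight_unordered(struct toy_entity *se, uint64_t weight)
{
	se->weight = weight;
}

/*
 * Ordering the patch establishes: commit pending time at the old weight
 * first, then change the weight.
 */
static void toy_reweight_ordered(struct toy_entity *se, uint64_t weight)
{
	toy_update_curr(se);
	se->weight = weight;
}
```

In this model, an entity with 1 ms of pending execution time at weight 1024
that is dropped to TOY_MIN_SHARES is charged 512 ms of vruntime with the
unordered variant, but 1 ms with the ordered one.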