On a single cpu system with SMT, in the scenario where one SMT thread is
idle while the other SMT thread runs a task that does a non-sync wakeup of
another task, we see (from the traces) that the woken up task ends up
running on the busy thread instead of the idle thread. Idle balancing,
which comes in a little bit later, fixes the scenario. But fixing this
wake balance and running the woken up task directly on the idle SMT
thread improved the performance (phoronix 7zip compression workload) by
~9% on an Atom platform.

During the process wakeup, select_task_rq_fair() and wake_affine() decide
whether to wake up the task on the cpu it previously ran on or on the cpu
it is currently being woken up on. select_task_rq_fair() also checks
whether there are any idle siblings of the cpu the task is woken up on,
to ensure that we select an idle sibling rather than a busy cpu.

In the above load scenario, it so happens that prev_cpu (where the task
ran before) and this_cpu (where it is currently being woken up) are the
same. In this case wake_affine() returns 0, so we end up not selecting
the idle sibling chosen by select_idle_sibling() in select_task_rq_fair().
Further down the select_task_rq_fair() path, we ultimately select the
currently running cpu (the busy SMT thread) instead of the idle SMT
thread.

Check for prev_cpu == this_cpu before calling wake_affine(); there is no
need to do any fancy stuff (and ultimately take the wrong decision) in
this case.

Signed-off-by: Suresh Siddha
---
Changes from v1: Move the "this_cpu == prev_cpu" check before calling
wake_affine()
---
 kernel/sched_fair.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Index: tip/kernel/sched_fair.c
===================================================================
--- tip.orig/kernel/sched_fair.c
+++ tip/kernel/sched_fair.c
@@ -1454,6 +1454,7 @@ static int select_task_rq_fair(struct ta
 	int want_affine = 0;
 	int want_sd = 1;
 	int sync = wake_flags & WF_SYNC;
+	int this_cpu = cpu;
 
 	if (sd_flag & SD_BALANCE_WAKE) {
 		if (sched_feat(AFFINE_WAKEUPS) &&
@@ -1545,8 +1546,10 @@ static int select_task_rq_fair(struct ta
 			update_shares(tmp);
 	}
 
-	if (affine_sd && wake_affine(affine_sd, p, sync))
-		return cpu;
+	if (affine_sd) {
+		if (this_cpu == prev_cpu || wake_affine(affine_sd, p, sync))
+			return cpu;
+	}
 
 	while (sd) {
 		int load_idx = sd->forkexec_idx;
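
For illustration only, below is a minimal user-space sketch of the
placement decision described above, assuming a single two-thread SMT
core. It is not kernel code: cpu_idle[], select_idle_sibling_model(),
wake_affine_model() and pick_wake_cpu() are made-up stand-ins. With the
prev_cpu == this_cpu check, the sketch takes the idle-sibling choice
directly; without it, the wake_affine_model() == 0 case models how the
woken task ended up on the busy SMT thread.

/*
 * Standalone model of the wake-up placement decision.  NOT kernel code:
 * all helpers here are hypothetical stand-ins used purely to illustrate
 * the scenario in the changelog.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_CPUS 2			/* two SMT siblings of one core */
static bool cpu_idle[NR_CPUS];

/* Stand-in for select_idle_sibling(): prefer an idle sibling of target. */
static int select_idle_sibling_model(int target)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_idle[cpu])
			return cpu;
	return target;
}

/* Stand-in for wake_affine(): in the reported scenario it returns 0. */
static int wake_affine_model(int this_cpu, int prev_cpu, int sync)
{
	(void)this_cpu; (void)prev_cpu; (void)sync;
	return 0;
}

/*
 * Core of the change: when prev_cpu == this_cpu there is nothing for
 * wake_affine() to arbitrate, so take the idle-sibling choice directly
 * instead of falling through towards the busy cpu.
 */
static int pick_wake_cpu(int this_cpu, int prev_cpu, int sync)
{
	int target = select_idle_sibling_model(this_cpu);

	if (this_cpu == prev_cpu || wake_affine_model(this_cpu, prev_cpu, sync))
		return target;

	/* Fallthrough: where the old code landed in the reported scenario. */
	return this_cpu;
}

int main(void)
{
	cpu_idle[0] = false;	/* waker runs here (busy SMT thread) */
	cpu_idle[1] = true;	/* idle SMT sibling */

	/* Task last ran on cpu 0 and is woken from cpu 0 (prev == this). */
	printf("woken task placed on cpu %d\n", pick_wake_cpu(0, 0, 0));
	return 0;
}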