Message-Id: <cover.1648228023.git.tim.c.chen@linux.intel.com>
Date: Fri, 25 Mar 2022 15:54:15 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ingo Molnar <mingo@...e.hu>, Juri Lelli <juri.lelli@...hat.com>
Cc: Tim Chen <tim.c.chen@...ux.intel.com>,
Yu Chen <yu.c.chen@...el.com>,
Walter Mack <walter.mack@...el.com>,
Mel Gorman <mgorman@...e.de>, linux-kernel@...r.kernel.org
Subject: [PATCH 0/2] sched/fair: Fix starvation caused by task migration
During stress testing on a 2-socket Sapphire Rapids system, Walter Mack
noticed anomalies where tasks were starved for more than 70 seconds
before getting scheduled.
The stress test scenario is an extreme case: about 50 threads per CPU
are started on each core, and each thread then hops continuously from
one core to another.
We discussed this issue with Peter Z., who narrowed it down to a
problem with the vruntime of a migrated task being set too far out of
sync with the tasks on the target run queue.
Peter suggested the following two patches, which fixed
the starvation anomalies that Walter saw.
Yu Chen also kicked the patches into our 0-day test infrastructure to
check for regressions. The performance changes of note are below:
      5.15   Throughput   5.15+patchset   Test
               Changes
   4634070      -7.5%         4285823     stress-ng.sigsuspend.ops_per_sec
     29934     +37.0%           41006     aim7.jobs-per-min
Stress-ng sigsuspend is the worst affected, but most workloads
are not negatively impacted. In fact, we saw a 37% improvement
in Aim7 due to these patches.
Tim
Peter Zijlstra (1):
sched/fair: Don't rely on ->exec_start for migration
Peter Zijlstra (Intel) (1):
sched/fair: Simple runqueue order on migrate
include/linux/sched.h | 1 +
kernel/sched/fair.c | 37 +++++++++++++++++++++++++++++++++----
kernel/sched/features.h | 2 ++
3 files changed, 36 insertions(+), 4 deletions(-)
--
2.32.0