Message-ID: <56A8D994.6050205@redhat.com>
Date: Wed, 27 Jan 2016 15:52:04 +0100
From: Jan Stancek <jstancek@...hat.com>
To: alex.shi@...el.com, guz.fnst@...fujitsu.com, peterz@...radead.org,
mingo@...hat.com, jolsa@...hat.com, riel@...hat.com,
linux-kernel@...r.kernel.org
Subject: [BUG] scheduler doesn't balance thread to idle cpu for 3 seconds
Hello,
pthread_cond_wait_1/2 [1] occasionally fails for me on 4.5.0-rc1,
on an x86_64 KVM guest with 2 CPUs.
This test [1]:
- spawns 2 SCHED_RR threads
- the first thread, with higher priority, sets an alarm for 2 seconds and blocks on a condition variable
- the second thread, with lower priority, busy-loops for 5 seconds
- after 2 seconds the alarm signal arrives and the handler signals the condition
- the high-priority thread should resume running
Occasionally, however, the high-priority thread doesn't resume running until
the low-priority thread completes its 5-second busy loop.
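For readers without the LTP sources at hand, the scenario described above can be sketched in C roughly as follows. This is not the code from [1]; it is a minimal illustration under several assumptions. The names run_scenario() and spawn() are hypothetical, and the priorities (lowest SCHED_RR priority and one above it) are chosen for illustration only. SIGALRM is blocked in both worker threads so that the handler runs in the main thread, and the sketch falls back to default scheduling when SCHED_RR cannot be set (for example when not running as root). The handler calls pthread_cond_signal() as the description says, although POSIX does not list that function as async-signal-safe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static atomic_int signaled;            /* predicate, set by the handler */
static unsigned alarm_seconds, busy_seconds;
static double armed_at, woke_at;       /* written by the waiter only */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Follows the test description; pthread_cond_signal() is not
 * async-signal-safe per POSIX, so this is illustrative only. */
static void on_alarm(int sig)
{
    (void)sig;
    atomic_store(&signaled, 1);
    pthread_cond_signal(&cond);
}

static void *high_prio(void *arg)      /* arms the alarm, then blocks */
{
    (void)arg;
    pthread_mutex_lock(&lock);
    armed_at = now();
    alarm(alarm_seconds);
    while (!atomic_load(&signaled))
        pthread_cond_wait(&cond, &lock);
    woke_at = now();
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *low_prio(void *arg)       /* spins for busy_seconds */
{
    double end = now() + busy_seconds;
    (void)arg;
    while (now() < end)
        ;
    return NULL;
}

/* Hypothetical helper: try SCHED_RR at prio, else default scheduling. */
static int spawn(pthread_t *t, void *(*fn)(void *), int prio)
{
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = prio };
    int rc;

    pthread_attr_init(&attr);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_RR);
    pthread_attr_setschedparam(&attr, &sp);
    rc = pthread_create(t, &attr, fn, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0)                       /* e.g. EPERM when unprivileged */
        rc = pthread_create(t, NULL, fn, NULL);
    return rc;
}

/* Returns how many seconds past the alarm expiry the waiter resumed,
 * or -1.0 if the threads could not be created. */
double run_scenario(unsigned alarm_sec, unsigned busy_sec)
{
    struct sigaction sa = { .sa_handler = on_alarm };
    pthread_t hi, lo;
    sigset_t set;
    int base = sched_get_priority_min(SCHED_RR);
    int hi_ok, lo_ok;

    assert(alarm_sec > 0 && busy_sec > alarm_sec);
    atomic_store(&signaled, 0);
    alarm_seconds = alarm_sec;
    busy_seconds = busy_sec;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    /* Workers inherit a mask with SIGALRM blocked; main unblocks it. */
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    hi_ok = spawn(&hi, high_prio, base + 1) == 0;
    lo_ok = hi_ok && spawn(&lo, low_prio, base) == 0;
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);

    if (hi_ok)
        pthread_join(hi, NULL);
    if (lo_ok)
        pthread_join(lo, NULL);
    return lo_ok ? woke_at - armed_at - (double)alarm_sec : -1.0;
}
```

With the timings from this report, the call would be run_scenario(2, 5), run with enough privilege to set SCHED_RR. A return value close to 0 would mean the waiter resumed promptly; a value close to 3 would match the failure described above, where the waiter only resumes once the busy loop ends.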
Looking at the traces (short version attached, long version at [2]),
I see that after 2 seconds the scheduler tries to wake up the main thread, but
it appears to do so on the same CPU where the low-priority SCHED_RR thread is
running, so nothing happens. The scheduler then makes numerous balance
attempts, but the main thread is never moved to the idle CPU.
My guess is that this started with the following commit, which changed weighted_cpuload():
commit b92486cbf2aa230d00f160664858495c81d2b37b
Author: Alex Shi <alex.shi@...el.com>
Date: Thu Jun 20 10:18:50 2013 +0800
sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
I could reproduce it with HEAD set at the above commit, but so far I couldn't
reproduce it with a 3.10 kernel.
Regards,
Jan
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/functional/threads/condvar/pthread_cond_wait_1.c
[2] http://jan.stancek.eu/tmp/pthread_cond_wait_failure/sched-trace1.tar.bz2
Attachment: "sched-trace1-short-v4.5-rc1" (text/plain, 5410 bytes)