Message-ID: <56AA39D6.4070509@redhat.com>
Date:	Thu, 28 Jan 2016 16:55:02 +0100
From:	Jan Stancek <jstancek@...hat.com>
To:	alex.shi@...el.com, guz.fnst@...fujitsu.com, peterz@...radead.org,
	mingo@...hat.com, jolsa@...hat.com, riel@...hat.com,
	linux-kernel@...r.kernel.org
Cc:	jstancek@...hat.com
Subject: Re: [BUG] scheduler doesn't balance thread to idle cpu for 3 seconds

On 01/27/2016 03:52 PM, Jan Stancek wrote:
> Hello,
> 
> pthread_cond_wait_1/2 [1] fails occasionally for me on 4.5.0-rc1,
> on an x86_64 KVM guest with 2 CPUs.
> 
> This test [1]:
> - spawns 2 SCHED_RR threads
> - first thread with higher priority sets alarm for 2 seconds and blocks on condition
> - second thread with lower priority is busy looping for 5 seconds
> - after 2 seconds alarm signal arrives and handler signals condition
> - high priority thread should resume running
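
For readers without the attachment at hand, the first step above (giving each
thread an explicit SCHED_RR priority) can be sketched roughly as follows. This
is only an illustration, not the code from the testcase: make_rr_attr() and
its offset parameter are hypothetical names, and only standard pthread calls
are used.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the attached testcase): initialise
 * *attr so that a thread created with it runs under SCHED_RR at priority
 * (minimum SCHED_RR priority + offset).
 * Returns 0 on success and a non-zero value on failure.
 */
int make_rr_attr(pthread_attr_t *attr, int offset)
{
    struct sched_param param = { 0 };
    int min = sched_get_priority_min(SCHED_RR);
    int max = sched_get_priority_max(SCHED_RR);
    int rc;

    assert(attr != NULL);
    if (min == -1 || max == -1 || offset < 0 || min + offset > max)
        return -1;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Without this, the new thread inherits the creator's policy. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_RR);

    param.sched_priority = min + offset;
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &param);

    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

The low-prio thread would use a smaller offset than the high-prio one. Note
that passing such attributes to pthread_create() normally requires root or a
suitable RLIMIT_RTPRIO, otherwise thread creation fails with EPERM.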

I have slightly modified the testcase so that it finishes as soon as the
high-prio thread is done, and so that it compiles outside of the Open POSIX
testsuite.

The testcase is attached. I'm running it in the following way:

gcc -O2 -pthread pthread_cond_wait_1.c
while true; do
  time ./a.out
  sleep 1
done

for a couple thousand iterations. About half of those are
on a system booted with init=/bin/bash.

> 
> But occasionally the high priority thread doesn't resume running until
> the low priority thread completes its 5 second busy loop.
> 
> Looking at traces (short version attached, long version at [2]),
> I see that after 2 seconds the scheduler tries to wake up the main thread,
> but it appears to do that on the same CPU where the SCHED_RR low prio thread
> is running, so nothing happens. The scheduler then makes numerous balance
> attempts, but the main thread is not balanced to the idle CPU.
> 
> My guess is that this started with the following commit, which changed weighted_cpuload():
>   commit b92486cbf2aa230d00f160664858495c81d2b37b
>   Author: Alex Shi <alex.shi@...el.com>
>   Date:   Thu Jun 20 10:18:50 2013 +0800
>     sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task

Here are some numbers gathered from kernels with HEAD at b92486c and at the
previous commit, 83dfd52. The system is a 2 CPU KVM guest.

Each iteration measures how long the testcase took to finish.
Ideally that is about 2 seconds.

1. HEAD at 83dfd52 sched: Update cpu load after task_tick

  finish time [s]  |   iterations
----------------------------------
[    2,   2.2]     |       3134
[  2.2,   2.5]     |         18
[  2.5,     3]     |          0
[    3,     4]     |          0
[    4,     5]     |          0
[    5,   999]     |          0


2. HEAD at b92486c sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task

  finish time [s]  |   iterations
----------------------------------
[    2,   2.2]     |       1617
[  2.2,   2.5]     |         38
[  2.5,     3]     |        727
[    3,     4]     |        399
[    4,     5]     |         17
[    5,   999]     |         11
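
In case anyone wants to bin their own timings the same way, the bucketing in
the two tables can be sketched as below. This is an illustration rather than
the script I used: the function names are made up, and since the intervals
above are written as closed on both ends, I assume here that a value exactly
on a boundary is counted in the lower bucket and that times under 2 seconds
are not counted at all.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bounds (in seconds) of the six buckets used in the tables. */
enum { NUM_BUCKETS = 6 };
static const double bucket_upper[NUM_BUCKETS] = {
    2.2, 2.5, 3.0, 4.0, 5.0, 999.0
};

/*
 * Return the index of the bucket a finish time falls into, or -1 if it
 * lies outside [2, 999]. Assumption: boundary values go to the lower bucket.
 */
int bucket_index(double seconds)
{
    size_t i;

    if (seconds < 2.0)
        return -1;
    for (i = 0; i < NUM_BUCKETS; i++) {
        if (seconds <= bucket_upper[i])
            return (int)i;
    }
    return -1;
}

/* Reset counts[], then count how many of the n finish times hit each bucket. */
void tally(const double *times, size_t n, unsigned counts[NUM_BUCKETS])
{
    size_t i;

    assert(times != NULL || n == 0);
    assert(counts != NULL);

    for (i = 0; i < NUM_BUCKETS; i++)
        counts[i] = 0;
    for (i = 0; i < n; i++) {
        int b = bucket_index(times[i]);
        if (b >= 0)
            counts[b]++;
    }
}
```

Feeding the per-iteration wall-clock times to tally() gives one count per row
of the tables above.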

Regards,
Jan

> 
> I could reproduce it with HEAD set at the above commit, but so far I couldn't
> reproduce it with a 3.10 kernel.
> 
> Regards,
> Jan
> 
> [1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/functional/threads/condvar/pthread_cond_wait_1.c
> [2] http://jan.stancek.eu/tmp/pthread_cond_wait_failure/sched-trace1.tar.bz2
> 


View attachment "pthread_cond_wait_1.c" of type "text/plain" (6064 bytes)