Date:	Fri, 30 Jul 2010 11:59:13 -0700
From:	Nikhil Rao <ncrao@...gle.com>
To:	Mike Galbraith <efault@....de>
Cc:	Ingo Molnar <mingo@...e.hu>, Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	Venkatesh Pallipadi <venki@...gle.com>,
	Ken Chen <kenchen@...gle.com>, Paul Turner <pjt@...gle.com>
Subject: Re: [PATCH 0/6] [RFC] Large weight differential leads to inefficient 
	load balancing

On Fri, Jul 30, 2010 at 6:32 AM, Mike Galbraith <efault@....de> wrote:
> On Thu, 2010-07-29 at 22:19 -0700, Nikhil Rao wrote:
>> Hi all,
>>
>> We have observed that a large weight differential between tasks on a runqueue
>> leads to sub-optimal machine utilization and poor load balancing. For example,
>> if you have lots of SCHED_IDLE tasks (sufficient number to keep the machine 100%
>> busy) and a few SCHED_NORMAL soaker tasks, we see that the machine has
>> significant idle time.
>>
>> The data below highlights this problem. The test machine is a 4 socket quad-core
>> box (16 cpus). These experiments were done with v2.6.25-rc6. We spawn 16
>> SCHED_IDLE soaker threads (one per-cpu) to completely fill up the machine. CPU
>> utilization numbers gathered from mpstat for 10s are:
>>
>> 03:30:24 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>> 03:30:25 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00  16234.65
>> 03:30:26 PM  all   99.88    0.06    0.06    0.00    0.00    0.00    0.00    0.00  16374.00
>> 03:30:27 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00  16392.00
>> 03:30:28 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00  16612.12
>> 03:30:29 PM  all   99.88    0.00    0.12    0.00    0.00    0.00    0.00    0.00  16375.00
>> 03:30:30 PM  all   99.94    0.06    0.00    0.00    0.00    0.00    0.00    0.00  16440.00
>> 03:30:31 PM  all   99.81    0.00    0.19    0.00    0.00    0.00    0.00    0.00  16237.62
>> 03:30:32 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00  16360.00
>> 03:30:33 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00  16405.00
>> 03:30:34 PM  all   99.38    0.06    0.50    0.00    0.00    0.00    0.00    0.06  18881.82
>> Average:     all   99.86    0.02    0.12    0.00    0.00    0.00    0.00    0.01  16628.20
>>
>> We then spawn one SCHED_NORMAL while-1 task (the absolute number does not matter
>> so long as we introduce some large weight differential).
>>
>> 03:40:57 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>> 03:40:58 PM  all   83.06    0.00    0.06    0.00    0.00    0.00    0.00   16.88  14555.00
>> 03:40:59 PM  all   78.25    0.00    0.06    0.00    0.00    0.00    0.00   21.69  14527.00
>> 03:41:00 PM  all   82.71    0.06    0.06    0.00    0.00    0.00    0.00   17.17  14879.00
>> 03:41:01 PM  all   87.34    0.00    0.06    0.00    0.00    0.00    0.00   12.59  15466.00
>> 03:41:02 PM  all   80.80    0.06    0.19    0.00    0.00    0.00    0.00   18.95  14584.00
>> 03:41:03 PM  all   82.90    0.00    0.06    0.00    0.00    0.00    0.00   17.04  14570.00
>> 03:41:04 PM  all   79.45    0.00    0.06    0.00    0.00    0.00    0.00   20.49  14536.00
>> 03:41:05 PM  all   86.48    0.00    0.07    0.00    0.00    0.00    0.00   13.46  14577.00
>> 03:41:06 PM  all   76.73    0.06    0.06    0.00    0.00    0.06    0.00   23.10  14594.00
>> 03:41:07 PM  all   86.48    0.00    0.07    0.00    0.00    0.00    0.00   13.45  14703.03
>> Average:     all   82.31    0.02    0.08    0.00    0.00    0.01    0.00   17.59  14699.10
>
> What happens with s/SCHED_IDLE/nice 19?
>
>        -Mike

We see the same result with nice 19 as well.

With 16 nice-19 soakers:

10:15:16 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
10:15:17 AM  all    0.06   99.94    0.00    0.00    0.00    0.00    0.00    0.00  16296.04
10:15:18 AM  all    0.00   99.94    0.06    0.00    0.00    0.00    0.00    0.00  16379.00
10:15:19 AM  all    0.00   99.94    0.06    0.00    0.00    0.00    0.00    0.00  16414.00
10:15:20 AM  all    0.00   99.94    0.06    0.00    0.00    0.00    0.00    0.00  16413.00
10:15:21 AM  all    0.00  100.00    0.00    0.00    0.00    0.00    0.00    0.00  16402.00
10:15:22 AM  all    0.00   99.88    0.06    0.00    0.00    0.06    0.00    0.00  16419.00
10:15:23 AM  all    0.00   99.94    0.06    0.00    0.00    0.00    0.00    0.00  16406.00
10:15:24 AM  all    0.19   99.69    0.12    0.00    0.00    0.00    0.00    0.00  16613.13
10:15:25 AM  all    0.38   99.31    0.31    0.00    0.00    0.00    0.00    0.00  16313.86
10:15:26 AM  all    0.50   99.31    0.19    0.00    0.00    0.00    0.00    0.00  16623.23
Average:     all    0.11   99.79    0.09    0.00    0.00    0.01    0.00    0.00  16427.30

After adding a SCHED_NORMAL soaker to the mix:

10:17:44 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
10:17:45 AM  all    6.20   74.38    0.06    0.00    0.00    0.00    0.00   19.35  14419.80
10:17:46 AM  all    6.25   74.89    0.06    0.00    0.00    0.00    0.00   18.80  14619.00
10:17:47 AM  all    6.30   74.84    0.06    0.00    0.00    0.00    0.00   18.79  14590.00
10:17:48 AM  all    6.25   80.57    0.06    0.00    0.00    0.00    0.00   13.12  15511.00
10:17:49 AM  all    6.51   80.33    0.07    0.00    0.00    0.00    0.00   13.09  14904.00
10:17:50 AM  all    6.06   72.62    0.06    0.00    0.00    0.00    0.00   21.26  14564.00
10:17:51 AM  all    6.21   74.47    0.06    0.00    0.00    0.00    0.00   19.25  14584.00
10:17:52 AM  all    6.47   77.67    0.12    0.00    0.00    0.00    0.00   15.73  15295.96
10:17:53 AM  all    6.27   79.39    0.06    0.00    0.00    0.00    0.00   14.29  15251.00
10:17:54 AM  all    6.32   75.85    0.00    0.00    0.00    0.00    0.00   17.83  14537.00
Average:     all    6.28   76.47    0.06    0.00    0.00    0.00    0.00   17.18  14826.70

The problem is the large weight differential between nice
19/SCHED_IDLE and SCHED_NORMAL. I ran a quick experiment with the
soaker tasks at different nice levels; the data is in the table below.
The first column is the nice level, the second is idle% on the machine
(mpstat 10s average), and the third is the ratio of that nice level's
weight to the nice-0 weight (weight/1024).

nice    idle%    weight/1024
   0     0.00    1
   1     0.00    0.800781
   2     0.00    0.639648
   3     0.00    0.513672
   4     0.00    0.413086
   5     0.17    0.327148
   6     1.06    0.265625
   7     7.62    0.209961
   8     4.47    0.167969
   9    11.78    0.133789
  10    13.52    0.107422
  11    14.92    0.0849609
  12    14.33    0.0683594
  13    17.47    0.0546875
  14    15.89    0.0439453
  15    18.69    0.0351562
  16    16.63    0.0283203
  17    17.04    0.0224609
  18    17.86    0.0175781
  19    18.13    0.0146484
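
For anyone who wants to reproduce the third column, here is a small sketch. It assumes the nice 0..19 weights from the scheduler's prio_to_weight[] table (nice 0 = 1024); the names NICE_WEIGHTS and weight_ratio are made up for this illustration and are not kernel identifiers.

```python
# Weights for nice levels 0..19, taken from the kernel's prio_to_weight[]
# table. NICE_WEIGHTS and weight_ratio are illustrative names only.
NICE_WEIGHTS = [1024, 820, 655, 526, 423, 335, 272, 215, 172, 137,
                110, 87, 70, 56, 45, 36, 29, 23, 18, 15]

NICE_0_WEIGHT = 1024  # weight of a nice-0 SCHED_NORMAL task


def weight_ratio(nice):
    """Return weight(nice) / weight(nice 0) for 0 <= nice <= 19."""
    if not 0 <= nice <= 19:
        raise ValueError("nice level must be in 0..19")
    return NICE_WEIGHTS[nice] / NICE_0_WEIGHT


if __name__ == "__main__":
    # Print one row per nice level: nice, weight ratio.
    for nice in range(20):
        print(f"{nice:2d}  {weight_ratio(nice):.6g}")
```

The ratio is just an integer weight divided by 1024, so the values in the table are exact up to the printed precision.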

It looks like we start seeing sub-optimal performance when the
weight ratio falls to roughly 0.3 or below (nice 5 and higher).
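
A rough way to read that threshold off the table is sketched below. The rows are copied from the table in this mail; the 0.1% idle cutoff and the names ROWS and first_degraded are arbitrary choices for illustration, not anything the scheduler defines.

```python
# (nice level, idle %, weight/1024) rows copied from the table in this mail.
ROWS = [
    (0, 0.00, 1.0), (1, 0.00, 0.800781), (2, 0.00, 0.639648),
    (3, 0.00, 0.513672), (4, 0.00, 0.413086), (5, 0.17, 0.327148),
    (6, 1.06, 0.265625), (7, 7.62, 0.209961), (8, 4.47, 0.167969),
    (9, 11.78, 0.133789), (10, 13.52, 0.107422), (11, 14.92, 0.0849609),
    (12, 14.33, 0.0683594), (13, 17.47, 0.0546875), (14, 15.89, 0.0439453),
    (15, 18.69, 0.0351562), (16, 16.63, 0.0283203), (17, 17.04, 0.0224609),
    (18, 17.86, 0.0175781), (19, 18.13, 0.0146484),
]


def first_degraded(rows, idle_threshold=0.1):
    """Return the first row whose idle% exceeds idle_threshold, or None.

    idle_threshold is an assumed cutoff (in percent) for "noticeable" idle
    time; rows are expected in increasing nice order.
    """
    for nice, idle, ratio in rows:
        if idle > idle_threshold:
            return nice, idle, ratio
    return None


if __name__ == "__main__":
    print(first_degraded(ROWS))        # cutoff of 0.1% idle
    print(first_degraded(ROWS, 1.0))   # stricter cutoff of 1% idle
```

With the 0.1% cutoff this picks out nice 5 (ratio 0.327148); with a 1% cutoff it picks out nice 6 (ratio 0.265625). Either way the knee sits around a ratio of 0.3.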

-Thanks,
Nikhil