Message-Id: <1225010790.8566.22.camel@marge.simson.net>
Date: Sun, 26 Oct 2008 09:46:30 +0100
From: Mike Galbraith <efault@....de>
To: Jiri Kosina <jkosina@...e.cz>
Cc: David Miller <davem@...emloft.net>, rjw@...k.pl,
Ingo Molnar <mingo@...e.hu>, s0mbre@...rvice.net.ru,
a.p.zijlstra@...llo.nl, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [tbench regression fixes]: digging out smelly deadmen.
On Sun, 2008-10-26 at 01:10 +0200, Jiri Kosina wrote:
> On Sat, 25 Oct 2008, David Miller wrote:
>
> > But note that tbench performance improved a bit in 2.6.25.
> > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
> > weird.
> > Just for the public record here are the numbers I got in my testing.
>
> I am currently looking at a very similar issue. For the public record,
> here are the numbers we have come up with so far (measured with dbench,
> so the absolute values are slightly different, but they still show a
> similar pattern):
>
> 208.4 MB/sec -- vanilla 2.6.16.60
> 201.6 MB/sec -- vanilla 2.6.20.1
> 172.9 MB/sec -- vanilla 2.6.22.19
> 74.2 MB/sec -- vanilla 2.6.23
> 46.1 MB/sec -- vanilla 2.6.24.2
> 30.6 MB/sec -- vanilla 2.6.26.1
>
> I.e. huge drop for 2.6.23 (this was with default configs for each
> respective kernel).
> 2.6.23-rc1 shows 80.5 MB/s, i.e. a few % better than final 2.6.23, but
> still pretty bad.
>
> I have gone through the commits that went into -rc1 and tried to figure
> out which one could be responsible. Here are the numbers:
>
> 85.3 MB/s for 2ba2d00363 (just before on-demand readahead has been merged)
> 82.7 MB/s for 45426812d6 (before cond_resched() has been added into page
> invalidation code)
> 187.7 MB/s for c1e4fe711a4 (just before CFS scheduler has been merged)
>
> So the current biggest suspect is CFS, but I don't have enough numbers yet
> to be able to point a finger at it with 100% certainty. Hopefully soon.
Hi,

High client count, right?

I reproduced this on my Q6600 box. However, I also reproduced it with
2.6.22.19. What I think you're seeing is just dbench creating a massive
train wreck. With CFS, the wreck appears more likely to be _sustained_
from start to end, but the wreckage is present in O(1) scheduler runs as
well, and it gets sustained from start to end there as well.
Throughput:

Kernel                    16 procs (MB/sec)   160 procs (MB/sec)
2.6.22.19-smp             967.933             147.879
                          950.325             349.959
                          953.382             126.821  <== massive jitter
2.6.22.19-cfs-v24.1-smp   978.047             170.662
                          943.254             39.388   <== sustained train wreck
                          934.042             239.574
2.6.23.17-smp             1173.97             100.996
                          1122.85             80.3747
                          1113.60             99.3723
2.6.24.7-smp              1030.34             256.419
                          970.602             257.008
                          1056.48             248.841
2.6.25.19-smp             955.874             40.5735
                          943.348             62.3966
                          937.595             17.4639
2.6.26.7-smp              904.564             118.364
                          891.824             34.2193
                          880.850             22.4938
2.6.27.4-smp              856.660             168.243
                          880.121             120.132
                          880.121             142.105
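
To put a rough number on the run-to-run jitter, the 160-client results can be summarized per kernel. The following is only a sketch: the figures are copied from the table above, while the names `RUNS_160` and `spread` are made up for illustration, and three runs per kernel is far too few for real statistics.

```python
from statistics import mean

# 160-client throughput in MB/sec, three runs per kernel,
# copied from the table above.
RUNS_160 = {
    "2.6.22.19-smp": [147.879, 349.959, 126.821],
    "2.6.22.19-cfs-v24.1-smp": [170.662, 39.388, 239.574],
    "2.6.23.17-smp": [100.996, 80.3747, 99.3723],
    "2.6.24.7-smp": [256.419, 257.008, 248.841],
    "2.6.25.19-smp": [40.5735, 62.3966, 17.4639],
    "2.6.26.7-smp": [118.364, 34.2193, 22.4938],
    "2.6.27.4-smp": [168.243, 120.132, 142.105],
}


def spread(values):
    """Return (mean, min, max, max/min ratio) for a list of throughputs."""
    lo, hi = min(values), max(values)
    return mean(values), lo, hi, hi / lo


if __name__ == "__main__":
    for kernel, values in RUNS_160.items():
        avg, lo, hi, ratio = spread(values)
        print(f"{kernel:26s} mean {avg:7.1f}  min {lo:7.1f}  "
              f"max {hi:7.1f}  max/min {ratio:5.2f}")
```

By this measure the 2.6.24.7 runs agree to within a few percent (max/min about 1.03), while the 2.6.22.19-cfs-v24.1 runs differ by more than a factor of six.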
Check out fugliness:
2.6.22.19-smp Throughput 35.5075 MB/sec 160 procs (start->end sustained train wreck)
Full output from above run:
dbench version 3.04 - Copyright Andrew Tridgell 1999-2004
Running for 60 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 12 secs
160 clients started
160 54 310.43 MB/sec warmup 1 sec
160 54 155.18 MB/sec warmup 2 sec
160 54 103.46 MB/sec warmup 3 sec
160 54 77.59 MB/sec warmup 4 sec
160 56 64.81 MB/sec warmup 5 sec
160 57 54.01 MB/sec warmup 6 sec
160 57 46.29 MB/sec warmup 7 sec
160 812 129.07 MB/sec warmup 8 sec
160 1739 205.08 MB/sec warmup 9 sec
160 2634 262.22 MB/sec warmup 10 sec
160 3437 305.41 MB/sec warmup 11 sec
160 3815 307.35 MB/sec warmup 12 sec
160 4241 311.07 MB/sec warmup 13 sec
160 5142 344.02 MB/sec warmup 14 sec
160 5991 369.46 MB/sec warmup 15 sec
160 6346 369.09 MB/sec warmup 16 sec
160 6347 347.97 MB/sec warmup 17 sec
160 6347 328.66 MB/sec warmup 18 sec
160 6348 311.50 MB/sec warmup 19 sec
160 6348 0.00 MB/sec execute 1 sec
160 6348 2.08 MB/sec execute 2 sec
160 6349 2.75 MB/sec execute 3 sec
160 6356 16.25 MB/sec execute 4 sec
160 6360 17.21 MB/sec execute 5 sec
160 6574 45.07 MB/sec execute 6 sec
160 6882 76.17 MB/sec execute 7 sec
160 7006 86.37 MB/sec execute 8 sec
160 7006 76.77 MB/sec execute 9 sec
160 7006 69.09 MB/sec execute 10 sec
160 7039 68.67 MB/sec execute 11 sec
160 7043 64.71 MB/sec execute 12 sec
160 7044 60.29 MB/sec execute 13 sec
160 7044 55.98 MB/sec execute 14 sec
160 7057 56.13 MB/sec execute 15 sec
160 7057 52.63 MB/sec execute 16 sec
160 7059 50.21 MB/sec execute 17 sec
160 7083 49.73 MB/sec execute 18 sec
160 7086 48.05 MB/sec execute 19 sec
160 7088 46.40 MB/sec execute 20 sec
160 7088 44.19 MB/sec execute 21 sec
160 7094 43.59 MB/sec execute 22 sec
160 7094 41.69 MB/sec execute 23 sec
160 7094 39.96 MB/sec execute 24 sec
160 7094 38.36 MB/sec execute 25 sec
160 7094 36.88 MB/sec execute 26 sec
160 7094 35.52 MB/sec execute 27 sec
160 7098 34.91 MB/sec execute 28 sec
160 7124 36.72 MB/sec execute 29 sec
160 7124 35.50 MB/sec execute 30 sec
160 7124 34.35 MB/sec execute 31 sec
160 7124 33.28 MB/sec execute 32 sec
160 7124 32.27 MB/sec execute 33 sec
160 7124 31.32 MB/sec execute 34 sec
160 7283 34.80 MB/sec execute 35 sec
160 7681 44.95 MB/sec execute 36 sec
160 7681 43.79 MB/sec execute 37 sec
160 7681 42.64 MB/sec execute 38 sec
160 7689 42.23 MB/sec execute 39 sec
160 7691 41.48 MB/sec execute 40 sec
160 7693 40.76 MB/sec execute 41 sec
160 7703 40.54 MB/sec execute 42 sec
160 7704 39.81 MB/sec execute 43 sec
160 7704 38.91 MB/sec execute 44 sec
160 7704 38.04 MB/sec execute 45 sec
160 7704 37.21 MB/sec execute 46 sec
160 7704 36.42 MB/sec execute 47 sec
160 7704 35.66 MB/sec execute 48 sec
160 7747 36.58 MB/sec execute 49 sec
160 7854 38.00 MB/sec execute 50 sec
160 7857 37.65 MB/sec execute 51 sec
160 7861 37.29 MB/sec execute 52 sec
160 7862 36.67 MB/sec execute 53 sec
160 7864 36.21 MB/sec execute 54 sec
160 7877 35.85 MB/sec execute 55 sec
160 7877 35.21 MB/sec execute 56 sec
160 8015 37.11 MB/sec execute 57 sec
160 8019 36.57 MB/sec execute 58 sec
160 8019 35.95 MB/sec execute 59 sec
160 8019 35.36 MB/sec cleanup 60 sec
160 8019 34.78 MB/sec cleanup 61 sec
160 8019 34.23 MB/sec cleanup 63 sec
160 8019 33.69 MB/sec cleanup 64 sec
160 8019 33.16 MB/sec cleanup 65 sec
160 8019 32.65 MB/sec cleanup 66 sec
160 8019 32.21 MB/sec cleanup 67 sec
160 8019 31.73 MB/sec cleanup 68 sec
160 8019 31.27 MB/sec cleanup 69 sec
160 8019 30.84 MB/sec cleanup 70 sec
160 8019 30.40 MB/sec cleanup 71 sec
160 8019 29.98 MB/sec cleanup 72 sec
160 8019 29.58 MB/sec cleanup 73 sec
160 8019 29.18 MB/sec cleanup 74 sec
160 8019 29.03 MB/sec cleanup 74 sec
Throughput 35.5075 MB/sec 160 procs
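
The wreck in a log like the one above shows up as runs of samples where the second column does not move. A minimal sketch for counting those samples follows. It assumes the second column is a progress counter and that lines have exactly the shape printed above; the names `parse` and `stalled_samples` are hypothetical, not part of dbench.

```python
import re

# Matches progress lines of the shape shown above, e.g.
# "160 6348 2.08 MB/sec execute 2 sec".
LINE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+([\d.]+) MB/sec\s+(warmup|execute|cleanup)\s+(\d+) sec"
)


def parse(text):
    """Return a list of (phase, second, counter, mb_per_sec) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rows.append(
                (m.group(4), int(m.group(5)), int(m.group(2)), float(m.group(3)))
            )
    return rows


def stalled_samples(rows, phase="execute"):
    """Count samples in `phase` whose counter equals the previous sample's.

    The previous sample may belong to the preceding phase, so a stall that
    spans the warmup/execute boundary is counted too.
    """
    stalls = 0
    prev = None
    for ph, _sec, counter, _mbps in rows:
        if ph == phase and prev is not None and counter == prev:
            stalls += 1
        prev = counter
    return stalls


# A few lines copied from the run above.
SAMPLE = """\
160 6348 311.50 MB/sec warmup 19 sec
160 6348 0.00 MB/sec execute 1 sec
160 6348 2.08 MB/sec execute 2 sec
160 6349 2.75 MB/sec execute 3 sec
160 6356 16.25 MB/sec execute 4 sec
160 6360 17.21 MB/sec execute 5 sec
"""

if __name__ == "__main__":
    print(stalled_samples(parse(SAMPLE)), "stalled execute samples")
```

Note that an unchanged counter does not necessarily mean zero bytes moved: in the sample the throughput figure rises from 0.00 to 2.08 while the counter stays at 6348.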
Throughput 180.934 MB/sec 160 procs (next run, non-sustained train wreck)
Full output of this run:
dbench version 3.04 - Copyright Andrew Tridgell 1999-2004
Running for 60 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 12 secs
160 clients started
160 67 321.43 MB/sec warmup 1 sec
160 67 160.61 MB/sec warmup 2 sec
160 67 107.04 MB/sec warmup 3 sec
160 67 80.27 MB/sec warmup 4 sec
160 67 64.21 MB/sec warmup 5 sec
160 267 89.74 MB/sec warmup 6 sec
160 1022 169.68 MB/sec warmup 7 sec
160 1821 240.62 MB/sec warmup 8 sec
160 2591 290.39 MB/sec warmup 9 sec
160 3125 308.04 MB/sec warmup 10 sec
160 3125 280.04 MB/sec warmup 11 sec
160 3217 263.23 MB/sec warmup 12 sec
160 3725 276.45 MB/sec warmup 13 sec
160 4237 288.32 MB/sec warmup 14 sec
160 4748 300.98 MB/sec warmup 15 sec
160 4810 286.69 MB/sec warmup 16 sec
160 4812 270.89 MB/sec warmup 17 sec
160 4812 255.95 MB/sec warmup 18 sec
160 4812 242.48 MB/sec warmup 19 sec
160 4812 230.35 MB/sec warmup 20 sec
160 4812 219.38 MB/sec warmup 21 sec
160 4812 209.41 MB/sec warmup 22 sec
160 4812 200.31 MB/sec warmup 23 sec
160 4812 191.96 MB/sec warmup 24 sec
160 4812 184.28 MB/sec warmup 25 sec
160 4812 177.19 MB/sec warmup 26 sec
160 4836 175.89 MB/sec warmup 27 sec
160 4836 169.61 MB/sec warmup 28 sec
160 4841 163.97 MB/sec warmup 29 sec
160 5004 163.03 MB/sec warmup 30 sec
160 5450 170.58 MB/sec warmup 31 sec
160 5951 178.79 MB/sec warmup 32 sec
160 6086 176.86 MB/sec warmup 33 sec
160 6127 174.53 MB/sec warmup 34 sec
160 6129 169.67 MB/sec warmup 35 sec
160 6131 165.36 MB/sec warmup 36 sec
160 6137 161.65 MB/sec warmup 37 sec
160 6141 157.85 MB/sec warmup 38 sec
160 6145 154.32 MB/sec warmup 39 sec
160 6145 150.46 MB/sec warmup 40 sec
160 6145 146.79 MB/sec warmup 41 sec
160 6145 143.30 MB/sec warmup 42 sec
160 6145 139.97 MB/sec warmup 43 sec
160 6145 136.78 MB/sec warmup 44 sec
160 6145 133.74 MB/sec warmup 45 sec
160 6145 130.84 MB/sec warmup 46 sec
160 6145 128.05 MB/sec warmup 47 sec
160 6178 128.41 MB/sec warmup 48 sec
160 6180 126.13 MB/sec warmup 49 sec
160 6184 124.09 MB/sec warmup 50 sec
160 6187 122.03 MB/sec warmup 51 sec
160 6192 120.19 MB/sec warmup 52 sec
160 6196 118.42 MB/sec warmup 53 sec
160 6228 116.88 MB/sec warmup 54 sec
160 6231 114.97 MB/sec warmup 55 sec
160 6231 112.92 MB/sec warmup 56 sec
160 6398 114.17 MB/sec warmup 57 sec
160 6401 112.44 MB/sec warmup 58 sec
160 6402 110.69 MB/sec warmup 59 sec
160 6402 108.84 MB/sec warmup 60 sec
160 6405 107.38 MB/sec warmup 61 sec
160 6405 105.65 MB/sec warmup 62 sec
160 6407 104.03 MB/sec warmup 64 sec
160 6431 103.16 MB/sec warmup 65 sec
160 6432 101.64 MB/sec warmup 66 sec
160 6432 100.10 MB/sec warmup 67 sec
160 6460 99.42 MB/sec warmup 68 sec
160 6698 100.92 MB/sec warmup 69 sec
160 7218 106.21 MB/sec warmup 70 sec
160 7254 36.49 MB/sec execute 1 sec
160 7254 18.24 MB/sec execute 2 sec
160 7259 21.06 MB/sec execute 3 sec
160 7359 37.80 MB/sec execute 4 sec
160 7381 34.05 MB/sec execute 5 sec
160 7381 28.37 MB/sec execute 6 sec
160 7381 24.32 MB/sec execute 7 sec
160 7381 21.28 MB/sec execute 8 sec
160 7404 21.03 MB/sec execute 9 sec
160 7647 43.24 MB/sec execute 10 sec
160 7649 39.94 MB/sec execute 11 sec
160 7672 38.48 MB/sec execute 12 sec
160 7680 37.10 MB/sec execute 13 sec
160 7856 46.09 MB/sec execute 14 sec
160 7856 43.02 MB/sec execute 15 sec
160 7856 40.33 MB/sec execute 16 sec
160 7856 37.99 MB/sec execute 17 sec
160 8561 71.30 MB/sec execute 18 sec
160 9070 92.10 MB/sec execute 19 sec
160 9080 88.86 MB/sec execute 20 sec
160 9086 86.13 MB/sec execute 21 sec
160 9089 82.70 MB/sec execute 22 sec
160 9095 79.98 MB/sec execute 23 sec
160 9098 77.32 MB/sec execute 24 sec
160 9101 74.78 MB/sec execute 25 sec
160 9105 72.70 MB/sec execute 26 sec
160 9107 70.34 MB/sec execute 27 sec
160 9110 68.40 MB/sec execute 28 sec
160 9114 66.60 MB/sec execute 29 sec
160 9114 64.38 MB/sec execute 30 sec
160 9114 62.30 MB/sec execute 31 sec
160 9146 61.31 MB/sec execute 32 sec
160 9493 68.80 MB/sec execute 33 sec
160 10040 80.50 MB/sec execute 34 sec
160 10567 91.12 MB/sec execute 35 sec
160 10908 96.72 MB/sec execute 36 sec
160 11234 101.86 MB/sec execute 37 sec
160 12062 118.23 MB/sec execute 38 sec
160 12987 135.90 MB/sec execute 39 sec
160 13883 152.07 MB/sec execute 40 sec
160 14730 166.18 MB/sec execute 41 sec
160 14829 165.26 MB/sec execute 42 sec
160 14836 162.03 MB/sec execute 43 sec
160 14851 158.64 MB/sec execute 44 sec
160 14851 155.11 MB/sec execute 45 sec
160 14851 151.74 MB/sec execute 46 sec
160 15022 151.70 MB/sec execute 47 sec
160 15292 153.38 MB/sec execute 48 sec
160 15580 155.28 MB/sec execute 49 sec
160 15846 156.73 MB/sec execute 50 sec
160 16449 164.00 MB/sec execute 51 sec
160 17097 171.56 MB/sec execute 52 sec
160 17097 168.32 MB/sec execute 53 sec
160 17310 168.62 MB/sec execute 54 sec
160 18075 177.42 MB/sec execute 55 sec
160 18828 186.31 MB/sec execute 56 sec
160 18876 184.04 MB/sec execute 57 sec
160 18876 180.87 MB/sec execute 58 sec
160 18879 177.81 MB/sec execute 59 sec
160 19294 180.80 MB/sec cleanup 60 sec
160 19294 177.84 MB/sec cleanup 61 sec
160 19294 174.97 MB/sec cleanup 63 sec
160 19294 172.24 MB/sec cleanup 64 sec
160 19294 169.55 MB/sec cleanup 65 sec
160 19294 166.95 MB/sec cleanup 66 sec
160 19294 164.42 MB/sec cleanup 67 sec
160 19294 161.97 MB/sec cleanup 68 sec
160 19294 159.59 MB/sec cleanup 69 sec
160 19294 157.28 MB/sec cleanup 70 sec
160 19294 155.03 MB/sec cleanup 71 sec
160 19294 152.86 MB/sec cleanup 72 sec
160 19294 150.76 MB/sec cleanup 73 sec
160 19294 148.71 MB/sec cleanup 74 sec
160 19294 146.70 MB/sec cleanup 75 sec
160 19294 144.75 MB/sec cleanup 76 sec
160 19294 142.85 MB/sec cleanup 77 sec
160 19294 141.72 MB/sec cleanup 77 sec
Throughput 180.934 MB/sec 160 procs
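
The MB/sec column in these logs looks like a running average over the current phase rather than a per-second rate (in the warmup lines the value decays as 1/t while the counter is stuck, e.g. 321.43, 160.61, 107.04). Under that assumption, the throughput of each individual interval can be recovered from consecutive samples. This is only a sketch: `interval_rates` is a hypothetical helper, the printed seconds are rounded, and the two-decimal averages limit the precision.

```python
def interval_rates(samples):
    """Recover per-interval throughput from running averages.

    `samples` is a list of (elapsed_sec, running_avg_mb_per_sec) taken from
    a single phase. If avg(t) = total_mb(t) / t, then the rate over the
    interval (t0, t1] is (avg1 * t1 - avg0 * t0) / (t1 - t0).
    Returns a list of (t1, mb_per_sec) pairs.
    """
    rates = []
    for (t0, avg0), (t1, avg1) in zip(samples, samples[1:]):
        rates.append((t1, (avg1 * t1 - avg0 * t0) / (t1 - t0)))
    return rates


# Execute-phase samples at 15..19 sec, copied from the run above.
EXECUTE = [(15, 43.02), (16, 40.33), (17, 37.99), (18, 71.30), (19, 92.10)]

if __name__ == "__main__":
    for sec, rate in interval_rates(EXECUTE):
        print(f"execute {sec:3d} sec: {rate:8.2f} MB/sec in this interval")
```

For these samples the recovered rate is essentially zero through second 17 (small negative values are rounding noise) and then jumps to several hundred MB/sec at seconds 18 and 19, which is the stall-then-burst pattern the averaged column smooths over.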