linux-kernel - Re: [REPORT] cfs-v6-rc2 vs sd-0.46 vs 2.6.21-rc7

Message-ID: <20070426120723.GA4092@elte.hu>
Date:	Thu, 26 Apr 2007 14:07:23 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Michael Gerdau <mgd@...hnosis.de>
Cc:	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Nick Piggin <npiggin@...e.de>,
	Gene Heskett <gene.heskett@...il.com>,
	Juliusz Chroboczek <jch@....jussieu.fr>,
	Mike Galbraith <efault@....de>,
	Peter Williams <pwil3058@...pond.net.au>,
	ck list <ck@....kolivas.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	William Lee Irwin III <wli@...omorphy.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Bill Davidsen <davidsen@....com>, Willy Tarreau <w@....eu>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [REPORT] cfs-v6-rc2 vs sd-0.46 vs 2.6.21-rc7

* Michael Gerdau <mgd@...hnosis.de> wrote:

> Hi list,
> 
> find below a test comparing
>     2.6.21-rc7 (mainline)
>     2.6.21-rc7-sd046
>     2.6.21-rc7-cfs-v6-rc2(*) (X @ nice 0)
>     2.6.21-rc7-cfs-v6-rc2(*) (X @ nice -10)
> running on a dualcore x86_64.

thanks for the testing!

as a summary: i think your numbers demonstrate nicely that the 
shorter 'timeslice length' that both CFS and SD utilize does not have a 
measurable negative impact on your workload. To measure the total impact 
of 'timeslicing' you might want to try the exact same workload with a 
much longer 'timeslice length' of say 400 msecs, via:

    echo 400000000 > /proc/sys/kernel/sched_granularity_ns  # on CFS
    echo 400 > /proc/sys/kernel/rr_interval                 # on SD

your existing numbers are a bit hard to analyze because the 3 workloads 
were started at the same time and they overlapped differently and 
utilized the system differently.

i think the primary number that makes sense to look at (which is perhaps 
the least sensitive to the 'overlap effect') is the 'combined user times 
of all 3 workloads' (in order of performance):

> 2.6.21-rc7:                              20589.423    100.00%
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice -10):    20613.845     99.88%
> 2.6.21-rc7-sd046:                        20617.945     99.86%
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice 0):      20743.564     99.25%

to me this gives the impression that it's all "within noise". In 
particular the two CFS results suggest that there's at least ~100 
seconds of noise in these results, because the renicing of X should have 
no impact on the result (the workloads are pure number-crunchers, and all 
use up the CPUs 100%, correct?), and even if it has an impact, renicing 
X to nice 0 should _speed up_ the result - not slow it down a bit like 
the numbers suggest.
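
the "within noise" argument above can be checked with a few lines of 
arithmetic. The following is only a rough sketch: the user times are 
copied from the table above, while the helper names 
(relative_performance, spread) are made up for illustration and are not 
part of any tool used in the test.

```python
# Combined user times in seconds, copied from the quoted table above.
user_times = {
    "2.6.21-rc7": 20589.423,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice -10)": 20613.845,
    "2.6.21-rc7-sd046": 20617.945,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice 0)": 20743.564,
}


def relative_performance(times, baseline):
    """Return each kernel's performance as a percentage of the baseline.

    Lower user time means better performance, so the baseline time is
    divided by each kernel's time. (Hypothetical helper.)
    """
    base = times[baseline]
    return {name: 100.0 * base / t for name, t in times.items()}


def spread(times, a, b):
    """Absolute difference in seconds between two runs. (Hypothetical helper.)"""
    return abs(times[a] - times[b])


rel = relative_performance(user_times, "2.6.21-rc7")
for name, pct in rel.items():
    print(f"{name:40s} {user_times[name]:10.3f} s {pct:7.2f}%")

# The two CFS runs differ only in the nice level of X, so their
# difference serves as a lower bound on the run-to-run noise.
cfs_noise = spread(
    user_times,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice 0)",
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice -10)",
)
print(f"difference between the two CFS runs: {cfs_noise:.3f} s")
```

the difference between the two CFS runs comes out larger than the gap 
between mainline and either SD or CFS at nice -10, which is why those 
gaps cannot be told apart from noise.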

another (perhaps less reliable) number is the total wall-clock runtime 
of all 3 jobs. Provided i did not make any mistakes in my calculations, 
here are the results:

> 2.6.21-rc7-sd046:                        10512.611 seconds
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice -10):    10605.946 seconds
> 2.6.21-rc7:                              10650.535 seconds
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice 0):      10788.125 seconds

(the numbers are lower than the first numbers because this is a 2 CPU 
system)
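
to make the 2-CPU remark concrete, one can divide the combined user time 
by the total wall-clock time for each kernel. This is a sketch under 
assumptions: both sets of numbers are copied from the tables above, 
system time is ignored, and reading the ratio as "average number of busy 
CPUs" only holds if the wall-clock figure covers the whole run. The 
helper name busy_cpus is made up for illustration.

```python
NUM_CPUS = 2  # dual-core x86_64, as stated in the report

# Combined user times (seconds), from the first quoted table.
user_times = {
    "2.6.21-rc7": 20589.423,
    "2.6.21-rc7-sd046": 20617.945,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice -10)": 20613.845,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice 0)": 20743.564,
}

# Total wall-clock runtimes (seconds), from the second quoted table.
wall_times = {
    "2.6.21-rc7": 10650.535,
    "2.6.21-rc7-sd046": 10512.611,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice -10)": 10605.946,
    "2.6.21-rc7-cfs-v6-rc2 (X @ nice 0)": 10788.125,
}


def busy_cpus(user_s, wall_s):
    """Ratio of user time to wall-clock time. (Hypothetical helper.)"""
    return user_s / wall_s


ratios = {
    name: busy_cpus(user_times[name], wall_times[name]) for name in user_times
}
for name, ratio in ratios.items():
    print(f"{name:40s} {ratio:.3f} of {NUM_CPUS} CPUs")
```

a ratio somewhat below 2 for every kernel would be consistent with the 
CPUs not being busy for the entire run.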

both SD and CFS-nice-10 were faster than mainline, but i'd say this too 
is noise - especially because this result highly depends on the way the 
workloads overlap in general, which seems to be different for SD.

system time is interesting too:

> 2.6.21-rc7:                              35.379
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice -10):    40.399
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice 0):      44.239
> 2.6.21-rc7-sd046:                        45.515

here too the two CFS results seem to suggest that there's at least 
around 5 seconds of noise. So i'd not necessarily call it systematic 
that vanilla had the lowest system time and SD had the highest.

combined system+user time:

> 2.6.21-rc7:                              20624.802
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice 0):      20658.084
> 2.6.21-rc7-sd046:                        20663.460
> 2.6.21-rc7-cfs-v6-rc2 (X @ nice -10):    20783.963

perhaps it might make more sense to run the workloads serialized, to 
have better comparability of the individual workloads. (on a real system 
you'd naturally want to overlap these workloads to utilize the CPUs, so 
the numbers you measured are very relevant too.)

The vmstat output suggested there is occasional idle time in the system - 
is the workload IO-bound (or memory bound) in those cases?

	Ingo