Date:	Tue, 15 Jan 2008 23:06:41 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Colin Fowler <elethiomel@...il.com>
Cc:	linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: Performance loss 2.6.22->2.6.23->2.6.24-rc7 on CPU intensive
	benchmark on 8 Core Xeon


* Colin Fowler <elethiomel@...il.com> wrote:

> These data may be much better for you. It's a single 15 second data 
> collection run only when the actual ray-tracing is happening. These 
> data do not therefore cover the data structure building phase.
> 
> http://vangogh.cs.tcd.ie/fowler/cfs2/

hm, the system has considerable idle time left:

 r  b swpd   free   buff  cache   si so  bi bo   in    cs  us sy id wa
 8  0    0 1201920 683840 1039100  0  0   3  2   27    46   1  0 99  0
 2  0    0 1202168 683840 1039112  0  0   0  0  245 45339  80  2 17  0
 2  0    0 1202168 683840 1039112  0  0   0  0  263 47349  84  3 14  0
 2  0    0 1202300 683848 1039112  0  0   0 76  255 47057  84  3 13  0

and it context-switches roughly 45K times a second (the cs column). Do 
you know what is going on there? I thought ray-tracing was something 
that can be parallelized pretty efficiently, without having to contend 
and schedule too much.
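
to put numbers on the vmstat rows above, here is a minimal sketch that averages the cs and id columns over the three loaded samples and converts the idle percentage into "cores' worth" of idle time. The first vmstat row is skipped because it reports averages since boot. The helper names (mean, idle_cores) are made up for illustration, and the core count of 8 is taken from the subject line.

```c
#include <assert.h>
#include <stddef.h>

/* Samples copied from the three loaded vmstat rows quoted above. */
static const long cs_samples[]   = { 45339, 47349, 47057 };  /* cs column */
static const long idle_samples[] = { 17, 14, 13 };           /* id column, percent */

/* Hypothetical helper: arithmetic mean of n samples (n must be non-zero). */
static double mean(const long *v, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += (double)v[i];
    return sum / (double)n;
}

/* Hypothetical helper: idle percentage expressed as a number of idle cores. */
static double idle_cores(double idle_pct, unsigned int ncores)
{
    return idle_pct / 100.0 * (double)ncores;
}
```

calling mean(cs_samples, 3) and idle_cores(mean(idle_samples, 3), 8) gives the average context-switch rate and the idle capacity in cores for this capture; the same two calls can be reused on a 2.6.22 capture for comparison.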

could you try to do a similar capture on 2.6.22 as well (under the same 
phase of the same workload), as comparison?

there are a handful of 'scheduler feature bits' in 
/proc/sys/kernel/sched_features:

enum {
        SCHED_FEAT_NEW_FAIR_SLEEPERS    = 1,
        SCHED_FEAT_WAKEUP_PREEMPT       = 2,
        SCHED_FEAT_START_DEBIT          = 4,
        SCHED_FEAT_TREE_AVG             = 8,
        SCHED_FEAT_APPROX_AVG           = 16,
};

const_debug unsigned int sysctl_sched_features =
                SCHED_FEAT_NEW_FAIR_SLEEPERS    * 1 |
                SCHED_FEAT_WAKEUP_PREEMPT       * 1 |
                SCHED_FEAT_START_DEBIT          * 1 |
                SCHED_FEAT_TREE_AVG             * 0 |
                SCHED_FEAT_APPROX_AVG           * 0;

[as of 2.6.24-rc7]

could you try to turn some of them off/on? In particular toggling 
WAKEUP_PREEMPT might have an effect, and NEW_FAIR_SLEEPERS might have an 
effect as well. (TREE_AVG and APPROX_AVG probably have little effect)
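
since the features are a plain bitmask, the value to write for each experiment can be worked out by flipping one bit at a time. A minimal sketch, assuming the bit values and defaults quoted above; sched_feat_toggle and SCHED_FEAT_DEFAULT are hypothetical names used only for illustration, not kernel symbols:

```c
#include <assert.h>

/* Bit values copied from the enum quoted above (2.6.24-rc7). */
enum {
        SCHED_FEAT_NEW_FAIR_SLEEPERS    = 1,
        SCHED_FEAT_WAKEUP_PREEMPT       = 2,
        SCHED_FEAT_START_DEBIT          = 4,
        SCHED_FEAT_TREE_AVG             = 8,
        SCHED_FEAT_APPROX_AVG           = 16,
};

/* Default mask, mirroring the quoted sysctl_sched_features initializer. */
#define SCHED_FEAT_DEFAULT \
        (SCHED_FEAT_NEW_FAIR_SLEEPERS | SCHED_FEAT_WAKEUP_PREEMPT | \
         SCHED_FEAT_START_DEBIT)

/* Hypothetical helper: flip exactly one feature bit in a mask. */
static unsigned int sched_feat_toggle(unsigned int mask, unsigned int feat)
{
        /* feat must be a single bit, i.e. a non-zero power of two */
        assert(feat != 0 && (feat & (feat - 1)) == 0);
        return mask ^ feat;
}
```

with these defaults the mask is 7; toggling WAKEUP_PREEMPT gives 5 and toggling NEW_FAIR_SLEEPERS gives 6. Writing the resulting decimal value into /proc/sys/kernel/sched_features (as root) is assumed to be how the mask gets applied.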

other debug-tunables you might want to look into are in the 
/proc/sys/kernel/sched_domain hierarchy.

also, if you toggle:

  /sys/devices/system/cpu/sched_mc_power_savings

does that change the results?

	Ingo