linux-kernel - Re: VolanoMark regression with 2.6.27-rc1

Message-Id: <1219981101.8781.123.camel@ymzhang>
Date:	Fri, 29 Aug 2008 11:38:21 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Aneesh Kumar KV <aneesh.kumar@...ux.vnet.ibm.com>,
	Balbir Singh <balbir@...ibm.com>,
	Chris Friesen <cfriesen@...tel.com>
Subject: Re: VolanoMark regression with 2.6.27-rc1


On Fri, 2008-08-29 at 11:35 +0800, Zhang, Yanmin wrote:
> On Thu, 2008-08-21 at 14:48 +0800, Zhang, Yanmin wrote:
> > On Thu, 2008-08-21 at 08:16 +0200, Ingo Molnar wrote:
> > > * Zhang, Yanmin <yanmin_zhang@...ux.intel.com> wrote:
> > > 
> > > > > ok, i've applied this one to tip/sched/urgent instead of the 
> > > > > feature-disabling patchlet. Yanmin, could you please check whether this 
> > > > > one does the trick?
> > > >
> > > > > This new patch barely helps volanoMark. Please use the patch
> > > > > which sets LB_BIAS=1 by default.
> > > 
> > > ok. That also removes the kernel.h complications ;-)
> > Sorry, I have a new update.
> > Originally, I worked on 2.6.27-rc1. I just moved to 2.6.27-rc3 and found
> > something different when CONFIG_GROUP_SCHED=n.
> > 
> > With 2.6.27-rc3, on my 8-core stoakley, the whole volanoMark regression
> > disappears, whether or not I enable LB_BIAS. On 16-core tigerton, the
> > regression is still there if I don't enable LB_BIAS, but it shrinks from 65% to 11%.
> I have new updates on this regression. I checked the volanoMark web page and
> found the client command line has the options rooms and users. rooms is how many
> chat rooms will be started. users is how many users are in 1 room. The default
> rooms is 10 and users is 20, so every room has about 800 threads.
Sorry, every room has 80 threads.
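
The arithmetic behind the corrected figure can be sketched as follows. This is only an illustration: the value of 4 threads per user is an assumption inferred from the numbers above (80 threads per room divided by 20 users per room) and is not stated in this message. The function names are hypothetical.

```python
# Assumption: 4 threads per user, inferred from 80 threads / 20 users.
# This constant is not given in the message itself.
THREADS_PER_USER = 4


def threads_per_room(users, threads_per_user=THREADS_PER_USER):
    """Threads belonging to a single chat room."""
    return users * threads_per_user


def total_threads(rooms, users, threads_per_user=THREADS_PER_USER):
    """Threads across all chat rooms of one run."""
    return rooms * threads_per_room(users, threads_per_user)


if __name__ == "__main__":
    # Default volanoMark client settings quoted above: rooms 10, users 20.
    print(threads_per_room(20))
    print(total_threads(10, 20))
```

Under that assumption the default run has 80 threads per room and 800 threads over all 10 rooms.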

> As all threads of a room communicate only within that room, the number
> of rooms is important.
> 
> All my previous volanoMark testing used the default rooms 10 and users 20. With wake_affine
> in the kernel, waker and sleeper are gradually moved to the same cpu. However, if the
> number of rooms is not a multiple of the cpu number, load balancing makes the kernel move
> threads from one cpu to another continually. If there are so many threads that the cache-hot
> effect is weakened, load balance is more important. But if there are not too many threads
> running, cache-hot is more important than load balance. Should we prefer wake_affine more?
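
A toy model of the "rooms is not a multiple of the cpu number" case, as a rough sketch only. It assumes that wake_affine eventually packs every room onto a single cpu and that rooms are spread as evenly as possible. It does not model the real scheduler, and the helper names are hypothetical.

```python
def rooms_per_cpu(rooms, cpus):
    """Spread rooms over cpus as evenly as possible (toy model)."""
    base, extra = divmod(rooms, cpus)
    # The first `extra` cpus carry one room more than the others.
    return [base + 1 if cpu < extra else base for cpu in range(cpus)]


def is_balanced(rooms, cpus):
    """True when every cpu carries the same number of rooms."""
    counts = rooms_per_cpu(rooms, cpus)
    return max(counts) == min(counts)


if __name__ == "__main__":
    # Room counts used on the 8-core machine in the data below.
    for rooms in (8, 10, 16, 32):
        print(rooms, rooms_per_cpu(rooms, 8), is_balanced(rooms, 8))
```

In this model, 10 rooms on 8 cpus leaves two cpus with twice the load of the others, which is the situation where load balance and wake_affine pull in opposite directions.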
> 
> Below is some data I collected from numerous test runs on 3 machines.
> 
> 
> On 2-quadcore processor stoakley (8-core):
> kernel\rooms           |      8     |     10     |     16     |     32
> --------------------------------------------------------------------------
> 2.6.26_nogroup         |   385617   |   351247   |   323324   |   231934
> --------------------------------------------------------------------------
> 2.6.27-rc4_nogroup     |   359124   |   336984   |   335180   |   235258
> --------------------------------------------------------------------------
> 2.6.26group            |   381425   |   343636   |   312280   |   179673
> --------------------------------------------------------------------------
> 2.6.27-rc4group        |   212112   |   270000   |   300188   |   228465
> --------------------------------------------------------------------------
> 
> 
> On 2-quadcore+HT processor new x86_64 (8-core+HT, total 16 threads):
> kernel\rooms        |    10    |    16    |    24    |    32    |    64
> --------------------------------------------------------------------------
> 2.6.26_nogroup      |  667668  |  671860  |  671662  |  621900  |  509482
> --------------------------------------------------------------------------
> 2.6.27-rc4_nogroup  |  732346  |  800290  |  709272  |  648561  |  497243
> --------------------------------------------------------------------------
> 2.6.26group         |  705579  |  759464  |  693697  |  636019  |  500744
> --------------------------------------------------------------------------
> 2.6.27-rc4group     |  572426  |  674977  |  627410  |  590984  |  445651
> --------------------------------------------------------------------------
> 
> 
> On 4-quadcore tigerton processor (16-core) (32 rooms testing isn't stable on the machine, so no 32):
> kernel\rooms           |      8     |     10     |     16
> -------------------------------------------------------------
> 2.6.26_nogroup         |   346410   |   382938   |   349405
> -------------------------------------------------------------
> 2.6.27-rc4_nogroup     |   359124   |   336984   |   335180
> -------------------------------------------------------------
> 2.6.26group            |   504802   |   376513   |   319020
> -------------------------------------------------------------
> 2.6.27-rc4group        |   247652   |   284784   |   355132
> -------------------------------------------------------------
> 
> I also tried different users values with rooms 8 and found the results for users 20/40/60 are very close.
> 
> With group scheduling, 2.6.26 is mostly better than 2.6.27-rc4.
> Without group scheduling, the result depends on the specific machine.
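
The group-scheduling comparison can be quantified with a short sketch like the one below. The numbers are copied from the 8-core stoakley table above. The helper name and the choice of "percent change relative to 2.6.26" as the metric are illustrative assumptions, not something defined in this thread.

```python
# volanoMark results with group scheduling on the 8-core stoakley machine,
# copied from the table above and keyed by number of rooms.
RESULTS_2_6_26_GROUP = {8: 381425, 10: 343636, 16: 312280, 32: 179673}
RESULTS_2_6_27_RC4_GROUP = {8: 212112, 10: 270000, 16: 300188, 32: 228465}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative = regression)."""
    return (new - old) / old * 100.0


def compare(old_results, new_results):
    """Percent change per room count for the rooms present in both runs."""
    return {
        rooms: percent_change(old_results[rooms], new_results[rooms])
        for rooms in sorted(old_results)
        if rooms in new_results
    }


if __name__ == "__main__":
    for rooms, change in compare(RESULTS_2_6_26_GROUP,
                                 RESULTS_2_6_27_RC4_GROUP).items():
        print(f"rooms={rooms:2d}: {change:+.1f}%")
```

On these figures 2.6.27-rc4 is behind at 8, 10 and 16 rooms and ahead at 32 rooms, which matches "mostly better" rather than "always better".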
> 
> I also reran hackbench with group 10/16/32, and found that the result difference between
> the 2 kernels varies among group 10/16/32.
> 
> What are the most reasonable group/rooms values to test with?
> 
> On the other hand, tbench (starting CPU_NUM*2 clients) has about a 4~5% regression with 2.6.27-rc kernels.
> With 30-second schedstat data collected during the testing, I found there is almost no wake remote or wake
> affine with 2.6.26, but there are many wake_affine or wake remote events with 2.6.27-rc.

