Date:	Mon, 4 Aug 2008 11:23:39 +0530
From:	Dhaval Giani <dhaval@...ux.vnet.ibm.com>
To:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Aneesh Kumar KV <aneesh.kumar@...ux.vnet.ibm.com>,
	Balbir Singh <balbir@...ibm.com>
Subject: Re: VolanoMark regression with 2.6.27-rc1

On Mon, Aug 04, 2008 at 01:37:58PM +0800, Zhang, Yanmin wrote:
> 
> On Mon, 2008-08-04 at 10:52 +0530, Dhaval Giani wrote:
> > On Mon, Aug 04, 2008 at 01:04:38PM +0800, Zhang, Yanmin wrote:
> > > 
> > > On Fri, 2008-08-01 at 10:44 +0530, Dhaval Giani wrote:
> > > > On Fri, Aug 01, 2008 at 08:39:14AM +0800, Zhang, Yanmin wrote:
> > > > > 
> > > > > On Thu, 2008-07-31 at 15:49 +0800, Zhang, Yanmin wrote:
> > > > > > On Thu, 2008-07-31 at 09:39 +0200, Peter Zijlstra wrote:
> > > > > > > On Thu, 2008-07-31 at 15:31 +0800, Zhang, Yanmin wrote:
> > > > > > > > On Thu, 2008-07-31 at 11:20 +0800, Zhang, Yanmin wrote:
> > > > > > > > > Ingo,
> > > > > > > > > 
> > > > > > > > > Oh, it looks like these are the old issues from 2.6.26-rc1; the 2 patches were reverted before 2.6.26.
> > > > > > > > > New patches were merged into 2.6.27-rc1, but the issues are still not clearly resolved.
> > > > > > > > http://www.uwsg.iu.edu/hypermail/linux/kernel/0805.2/1148.html.
> > > > > > > 
> > > > > > > The new smp-group stuff doesn't remotely look like what was in .26
> > > > > > > 
> > > > > > > Also, on my quad (admittedly smaller than your machines) both volano and
> > > > > > > sysbench didn't regress anymore - where they clearly did with the code
> > > > > > > reverted from .26.
> > > > > > The regression I reported exists on:
> > > > > > > 1) 8-core+HT (16 logical processors in total) tulsa: 40% regression with volano, 8% with oltp;
> > > > > > > 2) 8-core+HT Montvale Itanium: 9% regression with volano, 8% with oltp;
> > > > > > > 3) 16-core tigerton: 70% with volano, 18% with oltp;
> > > > > > > 4) 8-core stoakley: 15% with oltp, testing failed with volanoMark.
> > > > > > > 
> > > > > > > So the issues are common across different architectures.
> > > > > > I know the kernel needs these features and it might not be a good idea to reject them over and over again.
> > > > > I will collect more data on tigerton and try to optimize it.
> > > > 
> > > > Hi Yanmin,
> > > > 
> > > > > Would it be possible for you to switch off the group scheduling feature
> > > > > and see if the regression still exists? In all our testing, we did not
> > > > > see a regression. I would like to eliminate it from your testing as
> > > > > well.
> > > > I tested with CONFIG_GROUP_SCHED=n. To test faster, I simplified the benchmark parameters.
> > > 
> > > volanoMark:
> > > kernel				| 	result
> > > ----------------------------------------------------------
> > > 2.6.27-rc1_group		|	205901
> > > ----------------------------------------------------------
> > > 2.6.27-rc1_nogroup		|	303377
> > > ----------------------------------------------------------
> > > 2.6.26_group			|	529388
> > > 
> > 
> > There seem to be two different regressions here. One in the user group
> > scheduling (which I do remember did have problems) and something totally
> > unrelated to group scheduling. In some of the runs I tried here, I got
> > similar results for 2.6.27-rc1_nogroup and 2.6.27-rc1_cgroup
> Does cgroup here mean CONFIG_CGROUPS? Or just a typo?
> 

It means CONFIG_CGROUP_SCHED.

> I never enable CONFIG_CGROUP.
> 
> >  but had bad
> > results for user. Anyway, we will need to fix both the regressions.
> That's great.
> 
> > Would it be possible for you to see what causes the regression between
> > 2.6.26 and 2.6.27-rc1 for the non group scheduling case?
> I will check it. But git bisect doesn't work on this issue. Most likely, it's still
> caused by the scheduler. Looking at the old emails about 2.6.26-rc1, we can see that the
> major scheduler issues were related to 2 patches, although I'm not sure the
> current regression is still caused by them.
> 

The current set of patches affects group scheduling. From your results,
there is a big performance regression between the 2.6.26 group
scheduling case and the 2.6.27-rc1 non-group scheduling case (where
normally the non-group scheduling case should have performed better). I
don't recall any major changes to the scheduler which would explain this
regression.
Peter, vatsa, any ideas?
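
To put numbers on the gap, here is a rough sketch that turns the three
volanoMark results quoted above into percentage regressions against
2.6.26_group. It assumes the results are throughput figures (higher is
better); the helper name regression_pct is only illustrative.

```python
# volanoMark results quoted earlier in this thread.
# Assumption: these are throughput numbers, so higher is better.
RESULTS = {
    "2.6.27-rc1_group": 205901,
    "2.6.27-rc1_nogroup": 303377,
    "2.6.26_group": 529388,
}


def regression_pct(baseline, candidate):
    """Return how far candidate falls below baseline, as a percentage."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    baseline = RESULTS["2.6.26_group"]
    for kernel, score in RESULTS.items():
        if kernel == "2.6.26_group":
            continue
        print(f"{kernel}: {regression_pct(baseline, score):.1f}% below 2.6.26_group")
```

With these inputs the non-group case comes out roughly 43% below
2.6.26_group and the group case roughly 61% below, which is why it looks
like two separate regressions.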

Thanks,
-- 
regards,
Dhaval