linux-kernel - Re: aim7 scalability issue on 4 socket machine

Message-ID: <20090918071217.GB17634@basil.fritz.box>
Date:	Fri, 18 Sep 2009 09:12:17 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Hugh Dickins <hugh.dickins@...cali.co.uk>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Andi Kleen <andi@...stfloor.org>,
	"lee.schermerhorn@...com" <lee.schermerhorn@...com>
Subject: Re: aim7 scalability issue on 4 socket machine

On Fri, Sep 18, 2009 at 07:53:58AM +0100, Hugh Dickins wrote:
> On Thu, 17 Sep 2009, Andrew Morton wrote:
> > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <yanmin_zhang@...ux.intel.com> wrote:
> > > > 
> > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > and let us know if that works as well for you - thanks.
> > > I tested Lee's patch and it does fix the issue.
> 
> Thanks for checking and reporting back, Yanmin.
> 
> > 
> > Do we think we should cook up something for -stable?
> 
> Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> is stable really for getting a better number out of a benchmark?

When your system is large enough, scalability problems (e.g.
lock contention) can be a serious bug; i.e., when your workload
is 150% slower than expected, that can well be a showstopper.

Admittedly, the workload in this case was a benchmark, but it's
not that far-fetched to expect the same problem in a real application.

We had a similar problem with the accounting lock some time
ago; I think that patch also went in.

So yes, I think simple, non-intrusive fixes for serious scalability
problems should be -stable candidates.

> > Either this is a regression or the workload is particularly obscure. 
> 
> I've not cross-checked descriptions, but assume Lee was actually
> testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> that machine happens to have characteristics which show up badly here.

AFAIK Lee usually tests on large IA64 boxes.

> > aim7 is sufficiently non-obscure to make me wonder what's happened here?
> 
> Not a regression, just the onward march of new hardware, I think.
> Could easily be other such things in other places with other tests.

Yes, it's just a much larger machine, so old hidden scalability sins now
appear.

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.