linux-kernel - Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets


Message-ID: <1347878845.6955.203.camel@marge.simpson.net>
Date:	Mon, 17 Sep 2012 12:47:25 +0200
From:	Mike Galbraith <efault@....de>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Andi Kleen <andi@...stfloor.org>,
	Borislav Petkov <bp@...en8.de>,
	Nikolay Ulyanitsky <lystor@...il.com>,
	linux-kernel@...r.kernel.org,
	Andreas Herrmann <andreas.herrmann3@....com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to
 3.6-rc5 on AMD chipsets - bisected

On Mon, 2012-09-17 at 12:07 +0200, Ingo Molnar wrote: 
> * Mike Galbraith <efault@....de> wrote:
> 

> >     4 socket 40 core + SMT Westmere box, single 30 sec tbench runs, higher is better:
> >     
> >      clients     1       2       4        8       16       32       64      128
> >      ..........................................................................
> >      pre        30      41     118      645     3769     6214    12233    14312
> >      post      299     603    1211     2418     4697     6847    11606    14557
> 
> That's a very tempting speedup for a simpler and more 
> fundamental workload than postgresql's somewhat weird
> user-space spinlocks that burn CPU time in user-space
> instead of blocking/waiting on a futex.
> 
> IIRC mysql does this properly and outperforms postgresql
> on this benchmark, in an apples-to-apples configuration?

It's been a while since I fiddled with oltp (lost my fast mysql db,
every attempt to re-create it produced a complete slug), but postgres was
always the throughput winner at that here.

> > 10x at 1 pair shouldn't be traversal, the whole box is 
> > otherwise idle. We'll do a lot more (ever more futile) 
> > traversal as load increases, but at the same time, our futile 
> > attempts fail more frequently, so we shoot ourselves in the 
> > foot less frequently.
> > 
> > The down side is (appears to be) that I also shut down some 
> > ~odd case preemption salvation, salvation that only large 
> > packages will receive.
> > 
> > The problem as I see it is that we're making light tasks _too_ 
> > mobile, turning an optimization into a pessimization for light 
> > tasks.  For longer running tasks this mobility within a large 
> > package isn't such a big deal, but for fast movers, it hurts a 
> > lot.
> 
> There's not enough time to resolve this for v3.6, so I agree 
> with the revert - would you be willing to post a v2 of your 
> original patch? I really think we want your tbench speedups, 
> quite a few real-world messaging applications use the tbench 
> patterns of scheduling.

I don't know what a v2 would look like, but I can keep thinking about
this irritating little <naughty words elided>.  Peter's a lot hairier
chested, not to mention having a sense of _taste_ :) so it might be
better to just consider my patch a diagnostic, and let him fix it up in
a (likely lots) less tummy distressing manner.

-Mike
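
The "10x at 1 pair" remark can be checked against the tbench table quoted earlier in this message. The sketch below is only an illustration: the names `CLIENTS`, `PRE`, `POST` and `speedups` are hypothetical, the numbers are copied from the quoted table, and no units are assumed because the message does not state them.

```python
# tbench throughput from the table quoted in the message (higher is better).
# Units are not stated in the message, so none are assumed here.
CLIENTS = [1, 2, 4, 8, 16, 32, 64, 128]
PRE = [30, 41, 118, 645, 3769, 6214, 12233, 14312]
POST = [299, 603, 1211, 2418, 4697, 6847, 11606, 14557]


def speedups(clients=CLIENTS, pre=PRE, post=POST):
    """Return {client_count: post / pre} for each column of the table."""
    return {c: after / before for c, before, after in zip(clients, pre, post)}


if __name__ == "__main__":
    # Print one ratio per client count, e.g. for eyeballing where the gain fades.
    for count, ratio in speedups().items():
        print(f"{count:>4} clients: {ratio:6.2f}x")
```

On these figures the ratio is close to 10x at 1 client, largest at 2 clients, shrinks steadily as the client count grows, and falls slightly below 1 at 64 clients before recovering at 128.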
