linux-kernel - Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets


Message-ID: <20120927051815.GA1075@liondog.tnic>
Date:	Thu, 27 Sep 2012 07:18:15 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Mike Galbraith <efault@....de>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mel Gorman <mgorman@...e.de>,
	Nikolay Ulyanitsky <lystor@...il.com>,
	linux-kernel@...r.kernel.org,
	Andreas Herrmann <andreas.herrmann3@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	Suresh Siddha <suresh.b.siddha@...el.com>
Subject: Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to
 3.6-rc5 on AMD chipsets - bisected

On Thu, Sep 27, 2012 at 07:09:28AM +0200, Mike Galbraith wrote:
> > The way I understand it is, you either want to share L2 with a process,
> > because, for example, both working sets fit in the L2 and/or there's
> > some sharing which saves you moving everything over the L3. This is
> > where selecting a core on the same L2 is actually a good thing.
> 
> Yeah, and if the wakee can't get to the L2 hot data instantly, it may be
> better to let wakee drag the data to an instantly accessible spot.

Yep, then moving it to another L2 is the same.

[ … ]

> > A crazy thought: one could go and sample tasks while running their
> > timeslices with the perf counters to know exactly what type of workload
> > we're looking at. I.e., do I have a large number of L2 evictions? Yes,
> > then spread them out. No, then select the other core on the L2. And so
> > on.
> 
> Hm.  That sampling better be really cheap.  Might help...

Yeah, that's why I said sampling, and not running the perf counters
during every timeslice.

But if you count the proper events, you should be able to tell what the
workload is doing (compute-bound, I/O-bound, contended, etc.).
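
To make the idea concrete, here is a rough sketch of the decision only,
written in Python for readability rather than as kernel code. Everything
in it is an assumption for illustration: the names, the sampling period
and the eviction threshold are made up, and it takes the counter values
as plain numbers instead of reading real perf counters.

```python
from dataclasses import dataclass

# Hypothetical threshold: L2 evictions per 1000 instructions above which
# the task is treated as thrashing its L2. Not a measured value.
EVICTIONS_PER_KILO_INSN_LIMIT = 5.0

# Hypothetical sampling period: count events in one timeslice out of this
# many, so the counters are not running during every timeslice.
SAMPLE_EVERY_N_SLICES = 16


@dataclass
class Sample:
    """Counter deltas collected over one sampled timeslice."""
    l2_evictions: int
    instructions: int


def should_sample(slice_index: int) -> bool:
    """Return True for the timeslices in which events are counted."""
    return slice_index % SAMPLE_EVERY_N_SLICES == 0


def placement_hint(sample: Sample) -> str:
    """Map one sample to a wakeup placement hint.

    Many L2 evictions -> "spread" (pick a core on another L2).
    Few L2 evictions  -> "share_l2" (pick the other core on the same L2).
    """
    if sample.instructions == 0:
        # Nothing retired in this sample; fall back to sharing the L2
        # (an arbitrary choice made for this sketch).
        return "share_l2"
    rate = 1000.0 * sample.l2_evictions / sample.instructions
    if rate > EVICTIONS_PER_KILO_INSN_LIMIT:
        return "spread"
    return "share_l2"
```

For example, placement_hint(Sample(l2_evictions=10000,
instructions=1000000)) gives "spread", while the same instruction count
with 1000 evictions gives "share_l2". Which events and thresholds are
actually meaningful would have to be determined per CPU family.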

> but how does that affect pgbench and ilk that must spread regardless
> of footprints.

Well, how do you measure latency of the 1 process in the 1:N case? Maybe
pipeline stalls of the 1 along with some way to recognize it is the 1 in
the 1:N case.

Hmm.

-- 
Regards/Gruss,
    Boris.