Message-ID: <20121120102010.GP8218@suse.de>
Date:	Tue, 20 Nov 2012 10:20:10 +0000
From:	Mel Gorman <mgorman@...e.de>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Turner <pjt@...gle.com>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Johannes Weiner <hannes@...xchg.org>,
	Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 00/27] Latest numa/core release, v16

> > Ingo, stop doing this kind of crap.
> > 
> > Let's make it clear: if the NUMA patches continue to regress 
> > performance for reasonable loads (and that very much includes 
> > "no THP") then they won't be merged.
> > 
> > You seem to be in total denial. Every time Mel sends out 
> > results that show that your patches MAKE PERFORMANCE WORSE you 
> > blame Mel, or blame the load, and never seem to admit that 
> > performance got worse.
> 
> No doubt numa/core should not regress with THP off or on and 
> I'll fix that.
> 
> As a background, here's how SPECjbb gets slower on mainline 
> (v3.7-rc6) if you boot Mel's kernel config and turn THP forcibly
> off:
> 
>   (avg: 502395 ops/sec)
>   (avg: 505902 ops/sec)
>   (avg: 509271 ops/sec)
> 
>   # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> 
>   (avg: 376989 ops/sec)
>   (avg: 379463 ops/sec)
>   (avg: 378131 ops/sec)
> 
> A ~30% slowdown.
> 
> [ How do I know? I asked for Mel's kernel config days ago and
>   actually booted Mel's very config in the past few days, 
>   spending hours on testing it on 4 separate NUMA systems, 
>   trying to find Mel's regression. In the past Mel was a 
>   reliable tester so I blindly trusted his results. Was that 
>   some weird sort of denial on my part? :-) ]
> 
> Every time a regression is reported I take it seriously - and 
> there were performance regression reports against numa/core not 
> just from Mel and I'm sure there will be more in the future. For 
> example I'm taking David Rientjes' fresh performance regression 
> report seriously as well.
> 
> What I have some problem with is Mel sending me his kernel 
> config as the thing he tested, a config which included:
> 
>   CONFIG_TRANSPARENT_HUGEPAGE=y
>   CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> 
> but he apparently went and explicitly disabled THP on top of 
> that - which was just a weird choice of 'negative test tuning' 
> to keep unreported.

I've already apologised for this omission and I'll apologise again if
necessary. The whole point of implementing MM Tests was to mitigate exactly
this sort of situation where the testing conditions are suspicious, so that
the configuration files and scripts can be rechecked. I had assumed the
lack of THP usage was due to the JVM using shared memory, because the JVM
I was using a few years ago did this if configured for hugepage usage.
Assumptions made a complete ass out of me here. When I rechecked what the
JVM was actually doing and reexamined the configuration, I saw the mistake
and admitted it immediately.
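
For anyone wanting to repeat that recheck, a rough sketch is below. It
assumes a single JVM process simply called "java" (adjust the pid lookup
to taste) and uses only the standard procfs/sysfs interfaces:

  # Global THP state on the test kernel
  cat /sys/kernel/mm/transparent_hugepage/enabled

  # How much of the JVM's anonymous memory is THP-backed?
  grep AnonHugePages /proc/$(pidof java)/smaps | \
        awk '{sum += $2} END {print sum " kB AnonHugePages"}'

  # Or is it drawing on statically reserved huge pages (SHM_HUGETLB)?
  grep -i hugepages /proc/meminfo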

I want to be absolutely clear that disabling THP like this was
unintentional, which is why I did not report it. I did not deliberately
hide such information because that would be completely unacceptable. The
root of the mistake actually dates from a few years ago, when tests would
be configured to run with base pages, huge pages and transparent hugepages
-- similar to how we might currently test the vanilla kernel, hard bindings
and automatic migration. Because of that history, some mmtests scripts
still support running with multiple page sizes, and I neglected to properly
identify this and retrofit a "default hugepage" configuration.

I have also already been clear that this was done for *all* the specjbb
tests. It was still a mistake, but it was evenly applied.

I've added two extra configuration files to run specjbb in single and multi
JVM configurations with THP enabled. A single test takes about 1.5 to 2
hours to complete, which means a full battery of tests for autonuma, the
vanilla kernel and schednuma will take a little over 24 hours (4 specjbb
tests, autobench and a few basic performance tests like kernbench,
hackbench etc). They will not be running back-to-back as the machine is not
dedicated to this. I'll report when they're done.

-- 
Mel Gorman
SUSE Labs
