Date:	Mon, 15 Aug 2016 17:45:17 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Dave Chinner <david@...morbit.com>
Cc:	Bob Peterson <rpeterso@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	"Huang, Ying" <ying.huang@...el.com>,
	Christoph Hellwig <hch@....de>,
	Wu Fengguang <fengguang.wu@...el.com>, LKP <lkp@...org>,
	Tejun Heo <tj@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

On Mon, Aug 15, 2016 at 5:17 PM, Dave Chinner <david@...morbit.com> wrote:
>
> Read the code, Linus?

I am. It's how I came up with my current pet theory.

But I don't actually have enough sane numbers to make it much more
than a cute pet theory. It *might* explain why you see tons of kswap
time and bad lock contention where it didn't use to exist, but ..

I can't recreate the problem, and your old profiles were bad enough
that they aren't really worth looking at.

> Except they *aren't broken*. They are simply *less accurate* than
> they could be.

They are so much less accurate that quite frankly, there's no point in
looking at them outside of "there is contention on the lock".

And considering that the numbers didn't even change when you had
spinlock debugging on, it's not the lock itself that causes this, I'm
pretty sure.

Because when you have normal contention due to the *locking* itself
being the problem, it tends to absolutely _explode_ with the debugging
spinlocks, because the lock itself becomes much more expensive.
Usually super-linearly.

But that wasn't the case here. The numbers stayed constant.

So yeah, I started looking at bigger behavioral issues, which is why I
zeroed in on that zone-vs-node change. But it might be a completely
broken theory. For example, if you still have the contention when
running plain 4.7, that theory was clearly complete BS.

And this is where "less accurate" means that they are almost entirely useless.

More detail needed. It might not be in the profiles themselves, of
course. There might be other much more informative sources if you can
come up with anything...

               Linus
