Date:	Fri, 4 Sep 2015 17:14:27 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Dave Chinner <david@...morbit.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Waiman Long <Waiman.Long@...com>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [4.2, Regression] Queued spinlocks cause major XFS performance
 regression

On Fri, Sep 04, 2015 at 08:05:16AM -0700, Linus Torvalds wrote:
> So at the very *minimum*, that second issue should be fixed, and the
> loop in virt_queued_spin_lock() should look something like
> 
>     do {
>         while (READ_ONCE(lock->val) != 0)
>             cpu_relax();
>     } while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
> 
> which at least has a chance in hell of behaving well on the bus and in
> a HT environment.

True.
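
For reference, the difference is easy to see in a stand-alone userspace
sketch (C11 atomics instead of the kernel primitives; the names are
made up for the illustration, so this is not the actual kernel code):

#include <stdatomic.h>

/* Illustration only, not kernel code: the two loop shapes in C11 atomics. */

static atomic_uint lock_val;		/* 0 = unlocked, 1 = locked */

/*
 * Plain test-and-set: every iteration is an atomic RMW, so all waiters
 * keep pulling the cache line exclusive and hammer the interconnect.
 */
static void spin_lock_tas(void)
{
	while (atomic_exchange_explicit(&lock_val, 1, memory_order_acquire))
		;
}

/*
 * Test-and-test-and-set, the shape quoted above: spin with plain loads
 * (the line stays shared) and only attempt the atomic op once the lock
 * looks free.
 */
static void spin_lock_ttas(void)
{
	unsigned int expected;

	do {
		while (atomic_load_explicit(&lock_val, memory_order_relaxed))
			;		/* cpu_relax() in the kernel version */
		expected = 0;
	} while (!atomic_compare_exchange_weak_explicit(&lock_val, &expected, 1,
			memory_order_acquire, memory_order_relaxed));
}

static void spin_unlock(void)
{
	atomic_store_explicit(&lock_val, 0, memory_order_release);
}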

> But I suspect that it would be even better for Dave to just disable
> the whole thing, and see how the queued locks actually work. Dave, can
> you turn that virt_queued_spin_lock() into just "return false"? In
> fact, I would almost _insist_ we do this when CONFIG_PARAVIRT_SPINLOCK
> isn't set, isn't that what our old ticket-spinlocks did? They didn't
> screw up and degrade to a test-and-set lock just because they saw a
> hypervisor - that only happened when things were paravirt-aware. No?

The reason we chose to revert to a test-and-set is that regular fair
locks, like the ticket and the queue thing, have horrible behaviour
under vcpu preemption.
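
For concreteness, combining your two suggestions would give something
like the below; this is a sketch against the 4.2-era helper in
arch/x86/include/asm/qspinlock.h from memory (and assumes the Kconfig
symbol is CONFIG_PARAVIRT_SPINLOCKS), not a tested patch:

/* Sketch only: approximates the 4.2 helper from memory, untested. */
static inline bool virt_queued_spin_lock(struct qspinlock *lock)
{
#ifndef CONFIG_PARAVIRT_SPINLOCKS
	/*
	 * Not paravirt-aware: take the normal queued slowpath even when
	 * running under a hypervisor, like the old ticket locks did.
	 */
	return false;
#else
	if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
		return false;

	/* Spin with plain loads; only cmpxchg once the lock looks free. */
	do {
		while (atomic_read(&lock->val) != 0)
			cpu_relax();
	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

	return true;
#endif
}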

> Dave, if you have the energy, try it both ways. But the code as-is for
> "I'm running in a hypervisor" looks just terminally broken. People who
> didn't run in hypervisors just never saw the breakage.

He did; it mostly restores performance, but the results were quite
erratic. Lock holder preemption problems get much worse with strict
queueing, so even though he's typically not overloaded, any vcpu
preemption can ripple through and create noise.


