Message-ID: <50F071F1.6090600@redhat.com>
Date: Fri, 11 Jan 2013 15:11:29 -0500
From: Rik van Riel <riel@...hat.com>
To: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
CC: Rafael Aquini <aquini@...hat.com>, linux-kernel@...r.kernel.org,
	walken@...gle.com, eric.dumazet@...il.com, lwoodman@...hat.com,
	jeremy@...p.org, Jan Beulich <JBeulich@...ell.com>, knoel@...hat.com,
	chegu_vinod@...com, mingo@...hat.com
Subject: Re: [PATCH 0/5] x86,smp: make ticket spinlock proportional backoff w/ auto tuning

On 01/10/2013 12:36 PM, Raghavendra K T wrote:
> * Rafael Aquini <aquini@...hat.com> [2013-01-10 00:27:23]:
>
>> On Wed, Jan 09, 2013 at 06:20:35PM +0530, Raghavendra K T wrote:
>>> I ran kernbench on 32 core (mx3850) machine with 3.8-rc2 base.
>>> x base_3.8rc2
>>> + rik_backoff
>>>     N           Min           Max        Median           Avg        Stddev
>>> x   8       222.977        231.16       227.735       227.388     3.1512986
>>> +   8        218.75       232.347      229.1035     228.25425     4.2730225
>>> No difference proven at 95.0% confidence
>>
>> I got similar results on smaller systems (1 socket, dual-cores and quad-cores)
>> when running Rik's latest series, no big difference for good nor for worse,
>> but I also think Rik's work is meant to address bigger systems with more cores
>> contending for any given spinlock.
>
> I was able to do the test on same 32 core machine with
> 4 guests (8GB RAM, 32 vcpu).
> Here are the results
>
> base = 3.8-rc2
> patched = base + Rik V3 backoff series [patch 1-4]

I believe I understand why this is happening.

Modern Intel and AMD CPUs have a feature called Pause Loop Exiting (PLE)
and Pause Filter (PF), respectively. This feature is used to trap to the
host when the guest is spinning on a spinlock, which allows the host to
run something else and lets the spinner temporarily yield the CPU.

Effectively, this means the KVM code already does a limited amount of
spinlock backoff, in the host. Adding more backoff code in the guest can
lead to wild delays in acquiring locks, and generally bad performance.

I suspect that when running in a virtual machine, we should limit the
delay factor to something much smaller, since the host will take care of
most of the backoff for us. Maybe a maximum delay value of ~10 would do
the trick for KVM guests.

We should be able to get this right by placing the value for the maximum
delay in a __read_mostly section, and setting it to something small from
an init function when we detect we are running in a virtual machine.

Let me cook up, and test, a patch that does that...

-- 
All rights reversed

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
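A rough sketch of the approach Rik describes above might look like the
following. This is only an illustration, not the patch he eventually
posted: the names spinlock_delay_max and virt_spinlock_backoff_init()
are hypothetical, and the default cap of 1000 is an assumed placeholder;
x86_hyper is the x86 hypervisor-detection pointer present in 3.8-era
kernels.

/* Hypothetical sketch of the idea above, not the actual posted patch. */
#include <linux/init.h>
#include <linux/cache.h>
#include <asm/hypervisor.h>	/* for x86_hyper */

/* Upper bound on the ticket spinlock backoff delay, in loop iterations. */
static int spinlock_delay_max __read_mostly = 1000;	/* assumed bare-metal default */

static int __init virt_spinlock_backoff_init(void)
{
	/*
	 * When running as a guest, the host's PLE/PF handling already
	 * provides most of the backoff, so cap the guest-side delay
	 * to a small value (~10, as suggested above).
	 */
	if (x86_hyper)
		spinlock_delay_max = 10;
	return 0;
}
early_initcall(virt_spinlock_backoff_init);

Because spinlock_delay_max is read on every contended lock acquisition
but written only once at boot, marking it __read_mostly keeps it out of
frequently written cache lines, which is the point of Rik's suggestion.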