Message-ID: <alpine.DEB.2.02.1302071122440.4275@kaball.uk.xensource.com>
Date: Thu, 7 Feb 2013 11:25:51 +0000
From: Stefano Stabellini <stefano.stabellini@...citrix.com>
To: Rik van Riel <riel@...hat.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"aquini@...hat.com" <aquini@...hat.com>,
"eric.dumazet@...il.com" <eric.dumazet@...il.com>,
"lwoodman@...hat.com" <lwoodman@...hat.com>,
"knoel@...hat.com" <knoel@...hat.com>,
"chegu_vinod@...com" <chegu_vinod@...com>,
"raghavendra.kt@...ux.vnet.ibm.com"
<raghavendra.kt@...ux.vnet.ibm.com>,
"mingo@...hat.com" <mingo@...hat.com>
Subject: Re: [PATCH -v5 5/5] x86,smp: limit spinlock delay on virtual
machines
On Wed, 6 Feb 2013, Rik van Riel wrote:
> Modern Intel and AMD CPUs will trap to the host when the guest
> is spinning on a spinlock, allowing the host to schedule in
> something else.
>
> This effectively means the host is taking care of spinlock
> backoff for virtual machines. It also means that doing the
> spinlock backoff in the guest anyway can lead to totally
> unpredictable results, extremely large backoffs, and
> performance regressions.
>
> To prevent those problems, we limit the spinlock backoff
> delay, when running in a virtual machine, to a small value.
>
> Signed-off-by: Rik van Riel <riel@...hat.com>
> ---
> arch/x86/include/asm/processor.h | 2 ++
> arch/x86/kernel/cpu/hypervisor.c | 2 ++
> arch/x86/kernel/smp.c | 21 +++++++++++++++++++--
> 3 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 888184b..4118fd8 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -997,6 +997,8 @@ extern bool cpu_has_amd_erratum(const int *);
> extern unsigned long arch_align_stack(unsigned long sp);
> extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
>
> +extern void init_guest_spinlock_delay(void);
> +
> void default_idle(void);
> bool set_pm_idle_to_default(void);
>
> diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
> index a8f8fa9..4a53724 100644
> --- a/arch/x86/kernel/cpu/hypervisor.c
> +++ b/arch/x86/kernel/cpu/hypervisor.c
> @@ -76,6 +76,8 @@ void __init init_hypervisor_platform(void)
>
> init_hypervisor(&boot_cpu_data);
>
> + init_guest_spinlock_delay();
> +
> if (x86_hyper->init_platform)
> x86_hyper->init_platform();
> }
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 64e33ef..fbc5ff3 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -116,8 +116,25 @@ static bool smp_no_nmi_ipi = false;
> #define DELAY_SHIFT 8
> #define DELAY_FIXED_1 (1<<DELAY_SHIFT)
> #define MIN_SPINLOCK_DELAY (1 * DELAY_FIXED_1)
> -#define MAX_SPINLOCK_DELAY (16000 * DELAY_FIXED_1)
> +#define MAX_SPINLOCK_DELAY_NATIVE (16000 * DELAY_FIXED_1)
> +#define MAX_SPINLOCK_DELAY_GUEST (16 * DELAY_FIXED_1)
> #define DELAY_HASH_SHIFT 6
> +
> +/*
> + * Modern Intel and AMD CPUs tell the hypervisor when a guest is
> + * spinning excessively on a spinlock. The hypervisor will then
> + * schedule something else, effectively taking care of the backoff
> + * for us. Doing our own backoff on top of the hypervisor's pause
> + * loop exit handling can lead to excessively long delays, and
> + * performance degradations. Limit the spinlock delay in virtual
> + * machines to a smaller value. Called from init_hypervisor_platform().
> + */
> +static int __read_mostly max_spinlock_delay = MAX_SPINLOCK_DELAY_NATIVE;
> +void __init init_guest_spinlock_delay(void)
> +{
> + max_spinlock_delay = MAX_SPINLOCK_DELAY_GUEST;
> +}
> +
Same comment as last time:
"""
Before reducing max_spinlock_delay, shouldn't we check that PAUSE-loop
exiting is available? What if we are running on an older x86 machine
that doesn't support it?
"""
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/