Message-ID: <461f162c-694f-2bb7-f9cb-55fa915434bc@redhat.com>
Date: Mon, 4 Mar 2019 09:33:16 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Fenghua Yu <fenghua.yu@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Dave Hansen <dave.hansen@...el.com>,
Ashok Raj <ashok.raj@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
Xiaoyao Li <xiaoyao.li@...el.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
kvm@...r.kernel.org
Subject: Re: [PATCH v4 01/17] x86/common: Align cpu_caps_cleared and
cpu_caps_set to unsigned long
On 02/03/19 03:44, Fenghua Yu wrote:
> cpu_caps_cleared and cpu_caps_set may not be aligned to unsigned long.
> Atomic operations (e.g. set_bit() and clear_bit()) on the bitmaps may then
> access two cache lines (a.k.a. a split lock) and lock the bus, blocking all
> memory accesses from other processors to ensure atomicity.
>
> To avoid the overall performance degradation from the bus locking, align
> the two variables to unsigned long.
>
> Defining the variables as unsigned long would also fix the issue because
> they would then be naturally aligned to unsigned long. But that needs
> additional code changes. Adding __aligned(sizeof(unsigned long)) is the
> simpler fix.
>
> Signed-off-by: Fenghua Yu <fenghua.yu@...el.com>
> ---
> arch/x86/kernel/cpu/common.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index cb28e98a0659..51ab37ba5f64 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -488,8 +488,9 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
> return NULL; /* Not found */
> }
>
> -__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
> -__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
> +/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
> +__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> +__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>
> void load_percpu_segment(int cpu)
> {
>
(resending including the list)
Why not change set_bit/clear_bit to use btsl/btrl instead of
btsq/btrq?
Thanks,
Paolo