Message-ID: <YBHrH1Rw/nQqgDK4@hirez.programming.kicks-ass.net>
Date:   Wed, 27 Jan 2021 23:37:19 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alexander A Sverdlin <alexander.sverdlin@...ia.com>
Cc:     Paul Burton <paul.burton@...tec.com>, linux-mips@...r.kernel.org,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 6/6] MIPS: cmpxchg: Use cmpxchg_local() for
 {cmp_}xchg_small()

On Wed, Jan 27, 2021 at 09:36:27PM +0100, Alexander A Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@...ia.com>
> 
> It makes no sense to fold smp_mb__before_llsc()/smp_llsc_mb() again and
> again, leave only one barrier pair in the outer function.
> 
> This removes one SYNCW from __xchg_small() and brings around 10%
> performance improvement in a tight spinlock loop with 6 threads on a 6 core
> Octeon.
> 
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@...ia.com>
> ---
>  arch/mips/kernel/cmpxchg.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/mips/kernel/cmpxchg.c b/arch/mips/kernel/cmpxchg.c
> index 89107de..122e85f 100644
> --- a/arch/mips/kernel/cmpxchg.c
> +++ b/arch/mips/kernel/cmpxchg.c
> @@ -41,7 +41,7 @@ unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int s
>  	do {
>  		old32 = load32;
>  		new32 = (load32 & ~mask) | (val << shift);
> -		load32 = cmpxchg(ptr32, old32, new32);
> +		load32 = cmpxchg_local(ptr32, old32, new32);
>  	} while (load32 != old32);
>  
>  	return (load32 & mask) >> shift;
> @@ -97,7 +97,7 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
>  		 */
>  		old32 = (load32 & ~mask) | (old << shift);
>  		new32 = (load32 & ~mask) | (new << shift);
> -		load32 = cmpxchg(ptr32, old32, new32);
> +		load32 = cmpxchg_local(ptr32, old32, new32);
>  		if (load32 == old32)
>  			return old;
>  	}

This is wrong; please use cmpxchg_relaxed(), which you've just
introduced. cmpxchg_local() need not be cross-CPU atomic at all (it is
on MIPS only by accident of implementation).
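
To illustrate the shape being asked for, here is a minimal userspace
sketch using C11 atomics rather than the kernel primitives. It is not
the MIPS code: xchg_u8_sketch() is a hypothetical name, the relaxed
compare-exchange stands in for cmpxchg_relaxed(), and the two fences
only approximate the role of the smp_mb__before_llsc()/smp_llsc_mb()
pair kept in the outer function. The point is that the inner
operation stays atomic across CPUs and gives up only the ordering.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of a sub-word exchange: swap one
 * byte lane of a 32-bit word. byte_idx selects bits [8*idx, 8*idx+7]
 * of the word value, so memory endianness does not matter here.
 */
static uint8_t xchg_u8_sketch(_Atomic uint32_t *word, unsigned int byte_idx,
			      uint8_t val)
{
	unsigned int shift = byte_idx * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old32, new32;

	/* One full barrier before the loop (assumed stand-in for smp_mb__before_llsc()). */
	atomic_thread_fence(memory_order_seq_cst);

	old32 = atomic_load_explicit(word, memory_order_relaxed);
	do {
		new32 = (old32 & ~mask) | ((uint32_t)val << shift);
		/*
		 * Relaxed ordering, but still a real atomic RMW: on failure
		 * old32 is refreshed with the current contents and we retry.
		 */
	} while (!atomic_compare_exchange_weak_explicit(word, &old32, new32,
							memory_order_relaxed,
							memory_order_relaxed));

	/* One full barrier after the loop (assumed stand-in for smp_llsc_mb()). */
	atomic_thread_fence(memory_order_seq_cst);

	return (uint8_t)((old32 & mask) >> shift);
}
```

Calling xchg_u8_sketch(&w, 1, 0xAB) on a word holding 0x11223344
returns the previous lane value 0x33 and leaves 0x1122AB44 behind.
A CPU-local variant would drop the cross-CPU atomicity guarantee
itself, which is exactly what a spinlock path cannot tolerate.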
