linux-kernel - Re: [PATCH v2 cmpxchg 12/13] sh: Emulate one-byte cmpxchg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240502205345.GK2118490@ZenIV>
Date: Thu, 2 May 2024 21:53:45 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
	elver@...gle.com, akpm@...ux-foundation.org, tglx@...utronix.de,
	peterz@...radead.org, dianders@...omium.org, pmladek@...e.com,
	arnd@...db.de, torvalds@...ux-foundation.org, kernel-team@...a.com,
	Andi Shyti <andi.shyti@...ux.intel.com>,
	Palmer Dabbelt <palmer@...osinc.com>,
	Masami Hiramatsu <mhiramat@...nel.org>, linux-sh@...r.kernel.org
Subject: Re: [PATCH v2 cmpxchg 12/13] sh: Emulate one-byte cmpxchg

On Thu, May 02, 2024 at 06:33:49AM -0700, Paul E. McKenney wrote:

> Understood, and this sort of compatibility consideration is why this
> version of this patchset does not emulate two-byte (16-bit) cmpxchg()
> operations.  The original (RFC) series did emulate these, which does
> not work on a few architectures that do not provide 16-bit load/store
> instructions, hence no 16-bit support in this series.
> 
> So this one-byte-only series affects only Alpha systems lacking
> single-byte load/store instructions.  If I understand correctly, Alpha
> 21164A (EV56) and later *do* have single-byte load/store instructions,
> and thus are still just fine.  In fact, it looks like EV56 also has
> two-byte load/store instructions, and so would have been OK with
> the original one-/two-byte RFC series.

Wait a sec.  On Alpha we already implement 16bit and 8bit xchg and cmpxchg.
See arch/alpha/include/asm/xchg.h:
static inline unsigned long
____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new)
{
       unsigned long prev, tmp, cmp, addr64;

       __asm__ __volatile__(
       "       andnot  %5,7,%4\n"
       "       inswl   %1,%5,%1\n"
       "1:     ldq_l   %2,0(%4)\n"
       "       extwl   %2,%5,%0\n"
       "       cmpeq   %0,%6,%3\n"
       "       beq     %3,2f\n"
       "       mskwl   %2,%5,%2\n"
       "       or      %1,%2,%2\n"
       "       stq_c   %2,0(%4)\n"
       "       beq     %2,3f\n"
       "2:\n"
       ".subsection 2\n"
       "3:     br      1b\n"
       ".previous"
       : "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64)
       : "r" ((long)m), "Ir" (old), "1" (new) : "memory");

       return prev;
}

Load-locked and store-conditional are done on 64bit value, with
16bit operations done in registers.  This is what 16bit store
(assignment to unsigned short *) turns into with
        stw $17,0($16)		// *(u16*)r16 = r17
and without -mbwx
        insql $17,$16,$17	// r17 = r17 << (8 * (r16 & 7))
        ldq_u $1,0($16)		// r1 = *(u64 *)(r16 & ~7)
	mskwl $1,$16,$1		// r1 &= ~(0xffff << (8 * (r16 & 7))
	bis $17,$1,$17		// r17 |= r1
	stq_u $17,0($16)	// *(u64 *)(r16 & ~7) = r17

What's more, load-locked/store-conditional doesn't have 16bit and 8bit
variants on any Alphas - it's always 32bit (ldl_l) or 64bit (ldq_l).

What BWX adds is load/store byte/word, load/store byte/word unaligned
and sign-extend byte/word.  IOW, it's absolutely irrelevant for
cmpxchg (or xchg) purposes.