Message-ID: <d35ea927d520569ac8b7482f5fedbe916005ff83.camel@xry111.site>
Date: Tue, 25 Nov 2025 11:04:03 +0800
From: Xi Ruoyao <xry111@...111.site>
To: George Guo <dongtai.guo@...ux.dev>, hev <r@....cc>
Cc: Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>, 
	loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org, George Guo
	 <guodongtai@...inos.cn>
Subject: Re: [PATCH v2 1/2] LoongArch: Add 128-bit atomic cmpxchg support

On Tue, 2025-11-25 at 10:43 +0800, George Guo wrote:
> > > +         "=ZB" (__ptr[0])      \
> > 
> > "ZB" isn't a legal constraint for the address operand in sc.q. When
> > assembled, it turns into something like sc.q $r,$r,$r,0, which clearly
> > doesn't match the instruction format, yet gas happily accepts it while
> > clang rightfully rejects it. Classic GNU-as leniency biting again. :)

I clearly remember that when Jiajie submitted the sc.q support to GAS,
Qinggang was really insistent on supporting the additional ",0" here.
But I don't really understand why we must support it...
> 
> Thanks for your advice. I tried sc.q with "r" and with "ZC"; the
> results are below (with gcc 14.2.1 on Fedora 42):
>    - sc.q with "r"  caused system hang

It won't work because it'll pass the value (not address) of __ptr[0].

>    - sc.q with "ZC" caused compiler error:
>      {standard input}: Assembler messages:
>      {standard input}:10037: Fatal error: Immediate overflow.

It won't work because the only accepted immediate of sc.q is 0, but ZC
would allow any multiple of 4 in [-32768, 32768).  I.e. ZC is for
{ldptr,stptr,ll,sc}.{w,d}.
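
To make the mismatch concrete, here is a minimal C sketch of the two
offset ranges described above.  The helper names are hypothetical and
made up for illustration; this is not GCC or binutils code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers, for illustration only. */

/* ZC-style offset: a multiple of 4 in [-32768, 32768). */
static bool zc_offset_ok(long off)
{
	return off >= -32768 && off < 32768 && off % 4 == 0;
}

/* sc.q: the only immediate accepted is 0. */
static bool scq_offset_ok(long off)
{
	return off == 0;
}

/* Usage: an offset can be fine for ZC and still overflow for sc.q. */
static void offsets_demo(void)
{
	assert(zc_offset_ok(8));
	assert(!scq_offset_ok(8));
	assert(scq_offset_ok(0));
}
```

Any nonzero offset that the compiler picks under ZC is therefore
rejected once it reaches sc.q, which would be consistent with the
"Immediate overflow" error quoted above.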

As ZB is only used for sc.q so far in the GCC backend, maybe we can
change ZB to print simply $rX instead of $rX,0 and make LLVM do the
same.  Would someone submit a GCC patch for that?  Or is there already
such a constraint that I don't know about?

BTW for the barrier between ll.d and ld.d, "dbar 0x700" is enough to
order two loads on the same address, and a Loongson hardware engineer
just confirmed to me privately that "same address" can be read as "in
the same cacheline" here.  Thus it's enough in our case and it has a
lower overhead than "dbar 0".
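
A small C sketch of why the "same cacheline" reading covers the two
halves of a 128-bit value.  Two things here are assumptions of the
sketch, not statements from this thread: the 64-byte cacheline size and
the 16-byte alignment of the 128-bit value.  The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cacheline size for this sketch; not stated in the thread. */
#define ASSUMED_CACHELINE 64u

/* True if the bytes at a and b fall in the same (assumed) cacheline. */
static bool same_cacheline(uintptr_t a, uintptr_t b)
{
	return a / ASSUMED_CACHELINE == b / ASSUMED_CACHELINE;
}

/*
 * For a 16-byte-aligned 128-bit value, the low half (read by ll.d) and
 * the high half (read by ld.d at +8) always share a cacheline.
 */
static bool halves_share_line(uintptr_t base)
{
	assert(base % 16 == 0);	/* alignment is an assumption of this sketch */
	return same_cacheline(base, base + 8);
}
```

The same holds for any cacheline size that is a multiple of 16, since a
16-byte-aligned block never straddles such a boundary.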

-- 
Xi Ruoyao <xry111@...111.site>
