Message-Id: <4106601E-82BC-471D-8AD0-B5E8FE99C7CD@gmail.com>
Date: Sat, 28 Sep 2024 00:06:06 +0800
From: Alan Huang <mmpgouride@...il.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Jonas Oberhauser <jonas.oberhauser@...weicloud.com>,
 LKML <linux-kernel@...r.kernel.org>,
 RCU <rcu@...r.kernel.org>,
 linux-mm@...ck.org,
 lkmm@...ts.linux.dev,
 "Paul E. McKenney" <paulmck@...nel.org>,
 Frederic Weisbecker <frederic@...nel.org>,
 Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
 Joel Fernandes <joel@...lfernandes.org>,
 Josh Triplett <josh@...htriplett.org>,
 "Uladzislau Rezki (Sony)" <urezki@...il.com>,
 rostedt <rostedt@...dmis.org>,
 Lai Jiangshan <jiangshanlai@...il.com>,
 Zqiang <qiang.zhang1211@...il.com>,
 Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>,
 Will Deacon <will@...nel.org>,
 Waiman Long <longman@...hat.com>,
 Mark Rutland <mark.rutland@....com>,
 Thomas Gleixner <tglx@...utronix.de>,
 Kent Overstreet <kent.overstreet@...il.com>,
 Vlastimil Babka <vbabka@...e.cz>,
 maged.michael@...il.com,
 Neeraj upadhyay <neeraj.upadhyay@....com>
Subject: Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard
 pointers

On Sep 27, 2024, at 12:28, Boqun Feng <boqun.feng@...il.com> wrote:
> 
> On Fri, Sep 27, 2024 at 09:37:50AM +0800, Boqun Feng wrote:
>> 
>> 
>> On Fri, Sep 27, 2024, at 9:30 AM, Mathieu Desnoyers wrote:
>>> On 2024-09-27 02:01, Boqun Feng wrote:
>>>> #define ADDRESS_EQ(var, expr) \
>>>> ({ \
>>>> bool _____cmp_res = (unsigned long)(var) == (unsigned long)(expr); \
>>>> \
>>>> OPTIMIZER_HIDE_VAR(var); \
>>>> _____cmp_res; \
>>>> })
>>> 
>>> If the goal is to ensure gcc uses the register populated by the
>>> second, I'm afraid it does not work. AFAIU, "hiding" the dependency
>>> chain does not prevent the SSA GVN optimization from combining the
> 
> Note it's not hiding the dependency, rather the equality,
> 
>>> registers as being one and choosing one arbitrary source. "hiding"
> 
> after OPTIMIZER_HIDE_VAR(var), the compiler no longer knows whether 'var'
> is equal to 'expr', because OPTIMIZER_HIDE_VAR(var) uses "=r"(var)
> to indicate that the output is overwritten. So when 'var' is referenced
> later, the compiler cannot use the register holding the 'expr' value, or
> any other register that has the same value, because 'var' may have a
> different value from the compiler's POV.
> 
>>> the dependency chain before or after the comparison won't help here.
>>> 
>>> int fct_hide_var_compare(void)
>>> {
>>>     int *a, *b;
>>> 
>>>     do {
>>>         a = READ_ONCE(p);
>>>         asm volatile ("" : : : "memory");
>>>         b = READ_ONCE(p);
>>>     } while (!ADDRESS_EQ(a, b));
>> 
>> Note that ADDRESS_EQ() only hides its first parameter, so this should be ADDRESS_EQ(b, a).
>> 
> 
> I replaced ADDRESS_EQ(a, b) with ADDRESS_EQ(b, a), and the compile
> result shows it can prevent the issue:
> 
> gcc 14.2 x86-64:
> 
> fct_hide_var_compare:
> .L2:
>        mov     rcx, QWORD PTR p[rip]
>        mov     rdx, QWORD PTR p[rip]
>        mov     rax, rdx
>        cmp     rcx, rdx
>        jne     .L2
>        mov     eax, DWORD PTR [rax]
>        ret
> 
> gcc 14.2.0 ARM64:
> 
> fct_hide_var_compare:
>        adrp    x2, p
>        add     x2, x2, :lo12:p
> .L2:
>        ldr     x3, [x2]
>        ldr     x1, [x2]
>        mov     x0, x1
>        cmp     x3, x1
>        bne     .L2
>        ldr     w0, [x0]
>        ret
> 
> Link to godbolt:
> 
> https://godbolt.org/z/a7jsfzjxY

Checking the assembly that each compiler generates for an actual kernel
build on a local machine will give more accurate results. The kernel build
restricts some optimizations, so if you use Godbolt, make sure the compiler
arguments match those used for the kernel.
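
For checking locally, a minimal user-space sketch of the test discussed in
this thread might look like the following. The READ_ONCE() and
OPTIMIZER_HIDE_VAR() definitions are simplified stand-ins written for this
example rather than the kernel's own headers, and the 'value' variable is a
hypothetical target added so that 'p' points at something. ADDRESS_EQ() and
fct_hide_var_compare() follow the versions quoted in this thread, with the
arguments already swapped to ADDRESS_EQ(b, a).

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros (assumptions for this sketch). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Compare two addresses, then hide 'var' from the optimizer so that later
 * uses of 'var' cannot be replaced with 'expr'.
 */
#define ADDRESS_EQ(var, expr) \
({ \
	bool _____cmp_res = (unsigned long)(var) == (unsigned long)(expr); \
	\
	OPTIMIZER_HIDE_VAR(var); \
	_____cmp_res; \
})

static int value = 42;	/* hypothetical target, not from the thread */
int *p = &value;

int fct_hide_var_compare(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		__asm__ volatile ("" : : : "memory");
		b = READ_ONCE(p);
	} while (!ADDRESS_EQ(b, a));	/* hide 'b', the pointer dereferenced below */

	return *b;
}
```

Compiling this with something like "gcc -O2 -S" plus the flags taken from a
real kernel build (the exact set depends on the configuration; "make V=1"
prints the full compiler command lines) should give output closer to what
the kernel would actually get than a default Godbolt setup.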

> 
> Regards,
> Boqun
> 
>> Regards,
>> Boqun
>> 
>>>     return *b;
>>> }
>>> 
>>> gcc 14.2 x86-64:
>>> 
>>> fct_hide_var_compare:
>>>  mov    rax,QWORD PTR [rip+0x0]        # 67 <fct_hide_var_compare+0x7>
>>>  mov    rdx,QWORD PTR [rip+0x0]        # 6e <fct_hide_var_compare+0xe>
>>>  cmp    rax,rdx
>>>  jne    60 <fct_hide_var_compare>
>>>  mov    eax,DWORD PTR [rax]
>>>  ret
>>> main:
>>>  xor    eax,eax
>>>  ret
>>> 
>>> gcc 14.2.0 ARM64:
>>> 
>>> fct_hide_var_compare:
>>>         adrp    x0, .LANCHOR0
>>>         add     x0, x0, :lo12:.LANCHOR0
>>> .L12:
>>>         ldr     x1, [x0]
>>>         ldr     x2, [x0]
>>>         cmp     x1, x2
>>>         bne     .L12
>>>         ldr     w0, [x1]
>>>         ret
>>> p:
>>>         .zero   8
>>> 
>>> 
>>> -- 
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> https://www.efficios.com


