Date:   Tue, 28 Feb 2023 09:49:07 +0100
From:   Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To:     paulmck@...nel.org
Cc:     Andrea Parri <parri.andrea@...il.com>,
        Alan Stern <stern@...land.harvard.edu>,
        Jonas Oberhauser <jonas.oberhauser@...wei.com>,
        will@...nel.org, peterz@...radead.org, boqun.feng@...il.com,
        npiggin@...il.com, dhowells@...hat.com, j.alglave@....ac.uk,
        luc.maranget@...ia.fr, akiyks@...il.com, dlustig@...dia.com,
        joel@...lfernandes.org, urezki@...il.com, quic_neeraju@...cinc.com,
        frederic@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tools/memory-model: Make ppo a subrelation of po



On 2/27/2023 11:21 PM, Paul E. McKenney wrote:
> On Mon, Feb 27, 2023 at 09:13:01PM +0100, Jonas Oberhauser wrote:
>>
>> On 2/27/2023 8:40 PM, Andrea Parri wrote:
>>>> The LKMM doesn't believe that a control or data dependency orders a
>>>> plain write after a marked read.  Hence in this test it thinks that P1's
>>>> store to u0 can happen before the load of x1.  I don't remember why we
>>>> did it this way -- probably we just wanted to minimize the restrictions
>>>> on when plain accesses can execute.  (I do remember the reason for
>>>> making address dependencies induce order; it was so RCU would work.)
>>>>
>>>> The patch below will change what the LKMM believes.  It eliminates the
>>>> positive outcome of the litmus test and the data race.  Should it be
>>>> adopted into the memory model?
>>> (Unpopular opinion I know,) it should drop dependencies ordering, not
>>> add/promote it.
>>>
>>>     Andrea
>> Maybe not as unpopular as you think... :)
>> But either way IMHO it should be consistent; either take all the
>> dependencies that are true and add them, or drop them all.
>> In the latter case, RCU should change to an acquire barrier. (also, one
>> would have to deal with OOTA in some yet different way).
>>
>> Generally my position is that unless there's a real-world benchmark with
>> proven performance benefits of relying on dependency ordering, one should
>> use an acquire barrier. I haven't yet met such a case, but maybe one of you
>> has...
> https://www.msully.net/thesis/thesis.pdf page 128 (PDF page 141).
>
> Though this is admittedly for ARMv7 and PowerPC.
>

Thanks for the link.

It's true that on architectures that don't have an acquire load, and
have to use a fence instead, the penalty for using acquire rather than
relying on dependency ordering will be bigger.
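
To make the two options concrete, here is a minimal userspace sketch in
C11 atomics, not kernel code. All names (struct item, publish, the two
reader functions) are hypothetical. In the kernel the rough counterparts
would be smp_load_acquire() for the first reader and rcu_dereference()
for the second. Note that current compilers typically treat
memory_order_consume as acquire, so this only illustrates the intent.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload that a writer publishes to readers. */
struct item { int value; };

static struct item slot;
static _Atomic(struct item *) published = NULL;

/* Writer: initialize the payload, then publish the pointer with release. */
void publish(int v)
{
    slot.value = v;
    atomic_store_explicit(&published, &slot, memory_order_release);
}

/* Reader, option 1: acquire load. On architectures without an acquire
 * load instruction this is implemented with a fence after the load. */
int read_with_acquire(void)
{
    struct item *p = atomic_load_explicit(&published, memory_order_acquire);
    return p ? p->value : -1;
}

/* Reader, option 2: rely on the address dependency from the pointer load
 * to the dereference. C11 spells this memory_order_consume. */
int read_with_dependency(void)
{
    struct item *p = atomic_load_explicit(&published, memory_order_consume);
    return p ? p->value : -1;
}
```

Both readers return -1 until publish() has run and the published value
afterwards; they differ only in which ordering mechanism they request.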

But the more obvious question is what constitutes a real-world
benchmark :)
In my experience, optimizing barriers can yield large performance gains
when the optimized code is all you execute.
But once you embed that code into a real-world application, often 90%-99%
of the time is spent in the business logic, not in the data structure,
and then the benefits largely disappear.

Note also that many barriers are much cheaper when there's no
contention.

Because of that, making optimization decisions based on microbenchmarks 
can sometimes lead to a very poor "time invested" vs "total product 
improvement" ratio.
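
As a rough illustration of that ratio, the sketch below applies Amdahl's
law to the 90%-99% figure above. The function name and the 2x local
speedup are assumptions chosen for the example, not measurements.

```c
/* Amdahl's law: overall speedup when only a fraction of the run time
 * (here: the time spent in the data structure) gets faster.
 * fraction      - share of total time spent in the optimized code, 0..1
 * local_speedup - speedup of that code in isolation, e.g. 2.0 for 2x */
double overall_speedup(double fraction, double local_speedup)
{
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup);
}
```

With a hypothetical 2x gain from cheaper barriers in a microbenchmark,
overall_speedup(0.10, 2.0) is about 1.05 and overall_speedup(0.01, 2.0)
is about 1.005, i.e. roughly 5% and 0.5% for the whole application.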

Best wishes,
jonas
