lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFULd4aDORSrq7zf_LcAZRP8HOHcrq2-rGMaroKyG2zQDHNpOA@mail.gmail.com>
Date:   Mon, 23 Jan 2023 16:04:43 +0100
From:   Uros Bizjak <ubizjak@...il.com>
To:     David Laight <David.Laight@...lab.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mateusz Guzik <mjguzik@...il.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lib/genalloc: use try_cmpxchg in {set,clear}_bits_ll

On Thu, Jan 19, 2023 at 1:47 PM David Laight <David.Laight@...lab.com> wrote:
>
> > BTW: Recently, it was determined [1] that the usage of cpu_relax()
> > inside the cmpxchg loop can be harmful for performance. We actually
> > have the same situation here, so perhaps cpu_relax() should be removed
> > in the same way it was removed from the lockref.
>
> I'm not sure you can ever want a cpu_relax() in a loop that
> is implementing an atomic operation.
> Even the ia64 (die...) issue was with a loop that was waiting
> for another cpu to change the location (eg a spinlock).
>
> For an atomic operation an immediate retry is likely to succeed.
> Any kind of deferral to an another cpu can only make it worse.
>
> Clearly if you have 100s of cpu looping doing atomic operation
> on the same cache line it is likely that some get starved.
> But to fix that you need to increase the time between successful
> operations, not delay on failure.

I would like to point out that the wikipedia article on
compare-and-swap claims [1] that:

Instead of immediately retrying after a CAS operation fails,
researchers have found that total system performance can be improved
in multiprocessor systems—where many threads constantly update some
particular shared variable—if threads that see their CAS fail use
exponential backoff—in other words, wait a little before retrying the
CAS [2].

[1] https://en.wikipedia.org/wiki/Compare-and-swap#Overview
[2] https://arxiv.org/pdf/1305.5800.pdf

Uros.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ