Message-ID: <CAHirt9hsN9cy16TKSn7Bb+HG5M52FR1Ct8=7xDiM14+5K_S8eg@mail.gmail.com>
Date: Fri, 30 Jul 2021 00:24:14 +0800
From: hev <r@....cc>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Will Deacon <will@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
linux-kernel@...r.kernel.org, stern@...land.harvard.edu,
parri.andrea@...il.com, npiggin@...il.com, dhowells@...hat.com,
j.alglave@....ac.uk, luc.maranget@...ia.fr, paulmck@...nel.org,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
huacai chen <chenhuacai@...il.com>,
Guo Ren <guoren@...nel.org>, geert@...ux-m68k.org,
Huacai Chen <chenhuacai@...ngson.cn>,
Ingo Molnar <mingo@...hat.com>, Arnd Bergmann <arnd@...db.de>,
wangrui <wangrui@...ngson.cn>, lixuefeng <lixuefeng@...ngson.cn>,
Jiaxun Yang <jiaxun.yang@...goat.com>
Subject: Re: [PATCH] Documentation/atomic_t: Document forward progress expectations
Hi, Peter,
On Thu, Jul 29, 2021 at 10:40 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
>
> Add a few words on forward progress; there's been quite a bit of
> confusion on the subject.
>
> Specifically, more complex locking primitives (ticket/qspinlock) require
> forward progress from their constituent operations in order to provide
> better/more guarantees than TaS locks.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Acked-by: Will Deacon <will@...nel.org>
> Acked-by: Boqun Feng <boqun.feng@...il.com>
> ---
> --- a/Documentation/atomic_t.txt
> +++ b/Documentation/atomic_t.txt
> @@ -312,3 +312,56 @@ Both provide the same functionality, but
>
> NB. try_cmpxchg() also generates better code on some platforms (notably x86)
> where the function more closely matches the hardware instruction.
> +
> +
> +FORWARD PROGRESS
> +----------------
> +
> +In general strong forward progress is expected of all unconditional atomic
> +operations -- those in the Arithmetic and Bitwise classes and xchg(). However
> +a fair amount of code also requires forward progress from the conditional
> +atomic operations.
> +
> +Specifically 'simple' cmpxchg() loops are expected to not starve one another
> +indefinitely. However, this is not evident on LL/SC architectures, because
> +while an LL/SC architecture 'can/should/must' provide forward progress
> +guarantees between competing LL/SC sections, such a guarantee does not
> +transfer to cmpxchg() implemented using LL/SC. Consider:
Thanks for your explanation.
> +
> + old = atomic_read(&v);
> + do {
> + new = func(old);
> + } while (!atomic_try_cmpxchg(&v, &old, new));
We may need new APIs to help LL/SC architectures implement atomic
operations, but such APIs would obviously be incompatible with native
CAS, and a great many common functions are CAS-friendly. Letting
architectures override more of the functions that implement atomic
semantics may be one way forward. ;-)
For the example above, a correct implementation on LL/SC might look like:
    do {
        old = LL(&v);
        new = func(old, &skip);
        if (skip) {
            break;
        }
    } while (!SC(&v, new));
However, whether the SC succeeds may depend on func, whose complexity
is not constant. :-(
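To make the "overridable by the architecture" idea a bit more concrete, here is a rough user-space sketch in C11. The names update_fn, atomic_update_cond() and inc_unless_negative() are made up for illustration and are not existing kernel APIs. This generic body is the CAS-friendly version; the assumption is that an LL/SC architecture could supply its own body for the same helper, with func called between LL and SC as in the loop above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical callback type: computes the new value from the old one
 * and sets *skip to true when the update should be abandoned.
 */
typedef int (*update_fn)(int old, bool *skip);

/*
 * Hypothetical helper (not a kernel API): generic CAS-based body.
 * Returns true if a new value was stored, false if func asked to skip.
 * *oldp receives the value that func last observed.
 */
static bool atomic_update_cond(atomic_int *v, update_fn func, int *oldp)
{
    int old = atomic_load_explicit(v, memory_order_relaxed);

    for (;;) {
        bool skip = false;
        int new_val = func(old, &skip);

        if (skip) {
            *oldp = old;
            return false;
        }
        /* On failure, old is refreshed with the current value of *v. */
        if (atomic_compare_exchange_weak(v, &old, new_val)) {
            *oldp = old;
            return true;
        }
    }
}

/* Example callback: increment unless the counter is negative. */
static int inc_unless_negative(int old, bool *skip)
{
    *skip = old < 0;
    return old + 1;
}
```

A caller would then write atomic_update_cond(&v, inc_unless_negative, &old) instead of open-coding the loop, which leaves the loop shape up to whoever implements the helper.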
Regards,
Rui
> +
> +which on LL/SC becomes something like:
> +
> + old = atomic_read(&v);
> + do {
> + new = func(old);
> + } while (!({
> + volatile asm ("1: LL %[oldval], %[v]\n"
> + " CMP %[oldval], %[old]\n"
> + " BNE 2f\n"
> + " SC %[new], %[v]\n"
> + " BNE 1b\n"
> + "2:\n"
> + : [oldval] "=&r" (oldval), [v] "m" (v)
> + : [old] "r" (old), [new] "r" (new)
> + : "memory");
> + success = (oldval == old);
> + if (!success)
> + old = oldval;
> + success; }));
> +
> +However, even the forward branch from the failed compare can cause the LL/SC
> +to fail on some architectures, let alone whatever the compiler makes of the C
> +loop body. As a result there is no guarantee whatsoever the cacheline
> +containing @v will stay on the local CPU and progress is made.
> +
> +Even native CAS architectures can fail to provide forward progress for their
> +primitive (See Sparc64 for an example).
> +
> +Such implementations are strongly encouraged to add exponential backoff loops
> +to a failed CAS in order to ensure some progress. Affected architectures are
> +also strongly encouraged to inspect/audit the atomic fallbacks, refcount_t and
> +their locking primitives.
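As a footnote on the backoff suggestion at the end of the patch, a minimal user-space sketch in C11 could look like the following. BACKOFF_MAX, backoff_spin(), fetch_update_backoff() and add_one() are all hypothetical names, the cap is an arbitrary assumption, and the busy loop is only a stand-in for whatever relax/delay primitive an architecture would really use.

```c
#include <stdatomic.h>

/* Assumed upper bound on the spin count; the value is arbitrary. */
#define BACKOFF_MAX 1024u

/* Stand-in delay: spin for n iterations (volatile keeps the loop alive). */
static void backoff_spin(unsigned int n)
{
    for (volatile unsigned int i = 0; i < n; i++)
        ;
}

/*
 * Hypothetical helper: apply func to *v with a CAS loop, doubling the
 * delay after every failed CAS. Returns the value that was replaced.
 */
static int fetch_update_backoff(atomic_int *v, int (*func)(int))
{
    unsigned int delay = 1;
    int old = atomic_load_explicit(v, memory_order_relaxed);

    for (;;) {
        int new_val = func(old);

        /* On failure, old is refreshed with the current value of *v. */
        if (atomic_compare_exchange_strong(v, &old, new_val))
            return old;

        backoff_spin(delay);
        if (delay < BACKOFF_MAX)
            delay *= 2;
    }
}

/* Example update function. */
static int add_one(int x)
{
    return x + 1;
}
```

The point of the growing delay is that contending CPUs stop retrying in lockstep, so one of them gets a window in which its CAS can succeed.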