linux-kernel - Re: [RFC][PATCH] spin loop arch primitives for busy waiting

Message-ID: <CA+55aFzvRiVx3dvXaQUpYwoHq0DnrV9RPXWnqSeb9b80DFosHA@mail.gmail.com>
Date:   Thu, 6 Apr 2017 12:41:52 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Will Deacon <will.deacon@....com>,
        Nicholas Piggin <npiggin@...il.com>,
        David Miller <davem@...emloft.net>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Anton Blanchard <anton@...ba.org>,
        linuxppc-dev list <linuxppc-dev@...abs.org>
Subject: Re: [RFC][PATCH] spin loop arch primitives for busy waiting

On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra <peterz@...radead.org> wrote:
>
> Something like so then. According to the SDM mwait is a no-op if we do
> not execute monitor first. So this variant should get the first
> iteration without expensive instructions.

No, the problem is that we *would* have executed a prior monitor that
could still be pending - from a previous invocation of
smp_cond_load_acquire().

Especially with spinlocks, these things can very much happen back-to-back.

And it would be pending with a different address (the previous
spinlock) that might not have changed since then (and might not be
changing), so now we might actually be pausing in mwait waiting for
that *other* thing to change.

So it would probably need to do something complicated like

  #define smp_cond_load_acquire(ptr, cond_expr)                         \
  ({                                                                    \
        typeof(ptr) __PTR = (ptr);                                      \
        typeof(*ptr) VAL;                                               \
        do {                                                            \
                VAL = READ_ONCE(*__PTR);                                \
                if (cond_expr)                                          \
                        break;                                          \
                for (;;) {                                              \
                        ___monitor(__PTR, 0, 0);                        \
                        VAL = READ_ONCE(*__PTR);                        \
                        if (cond_expr) break;                           \
                        ___mwait(0xf0 /* C0 */, 0);                     \
                }                                                       \
        } while (0);                                                    \
        smp_acquire__after_ctrl_dep();                                  \
        VAL;                                                            \
  })

which might just generate nasty enough code to not be worth it.
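
To make the stale-monitor concern concrete, here is a toy user-space model
(a sketch only, not kernel code, and it executes no real MONITOR/MWAIT).
Everything in it is hypothetical and invented for illustration: struct
toy_cpu, toy_monitor(), toy_mwait() and the two first_wait_*() helpers. It
assumes a CPU keeps at most one armed address, that mwait falls through
when nothing is armed, and that "the condition" is simply *ptr != 0. The
mwait-first helper is a guess at the general shape of such a loop, not a
copy of anybody's patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one CPU's monitor hardware: at most one armed address. */
struct toy_cpu {
	const volatile void *armed;	/* set by the last monitor, or NULL */
};

/* Model of monitor: arm (or re-arm) the address range to watch. */
static void toy_monitor(struct toy_cpu *cpu, const volatile void *addr)
{
	cpu->armed = addr;
}

/*
 * Model of mwait: returns the address the CPU would stall on, or NULL
 * if no monitor is armed (the "no-op" case). Waiting consumes the arm.
 */
static const volatile void *toy_mwait(struct toy_cpu *cpu)
{
	const volatile void *waited = cpu->armed;

	cpu->armed = NULL;
	return waited;
}

/*
 * Hypothetical ordering that tries to keep the first iteration cheap by
 * calling mwait before any monitor. Returns what the first wait stalls
 * on (NULL if it did not stall).
 */
static const volatile void *first_wait_mwait_first(struct toy_cpu *cpu,
						   volatile int *ptr)
{
	const volatile void *waited;

	if (*ptr)
		return NULL;		/* condition already true */
	waited = toy_mwait(cpu);	/* may still be armed from earlier */
	toy_monitor(cpu, ptr);
	return waited;
}

/* Ordering from the macro above: monitor, re-check, then mwait. */
static const volatile void *first_wait_monitor_first(struct toy_cpu *cpu,
						     volatile int *ptr)
{
	if (*ptr)
		return NULL;		/* condition already true */
	toy_monitor(cpu, ptr);		/* replaces any stale arm */
	if (*ptr)
		return NULL;		/* exits with the monitor still armed */
	return toy_mwait(cpu);
}
```

In this model, if an earlier wait on lock A exits through the re-check
(leaving A armed) and the next wait is on lock B, the mwait-first ordering
stalls on A, while the monitor-first ordering stalls on B. The cost is that
monitor is executed before every wait, which is the code-size and overhead
concern raised above.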

I dunno

             Linus