Open Source and information security mailing list archives
 
Date:   Mon, 3 Apr 2017 17:43:05 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nicholas Piggin <npiggin@...il.com>
Cc:     "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Anton Blanchard <anton@...ba.org>,
        linuxppc-dev <linuxppc-dev@...abs.org>
Subject: Re: [RFC][PATCH] spin loop arch primitives for busy waiting

On Mon, Apr 3, 2017 at 4:50 PM, Nicholas Piggin <npiggin@...il.com> wrote:
>
> POWER does not have an instruction like pause. We can only set current
> thread priority, and current implementations do something like allocate
> issue cycles to threads based on relative priorities. So there should
> be at least one or two issue cycles at low priority, but ideally we
> would not be changing priority in the busy-wait loop because it can
> impact other threads in the core.
>
> I couldn't think of a good way to improve cpu_relax. Our (open source)
> firmware has a cpu_relax, and it puts a bunch of nops between low and
> normal priority instructions so we get some fetch cycles at low prio.
> That isn't ideal though.
>
> If you have any ideas, I'd be open to them.

So the idea would be that maybe we can just make those things
explicit. IOW, instead of having that magical looping construct that
does other magical hidden things as part of the loop, maybe we can
just have a

   begin_cpu_relax();
   while (!cond)
       cpu_relax();
   end_cpu_relax();

and then architectures can decide how they implement it. So for x86,
the begin/end macros would be empty. For ppc, maybe begin/end would be
the "lower and raise priority", while cpu_relax() itself is an empty
thing.
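
As a rough illustration of that split, here is a minimal user-space sketch, assuming GCC or Clang inline asm. The begin_cpu_relax()/end_cpu_relax() names come from the snippet above and are not an existing kernel API. The ppc branch assumes the usual SMT priority nops ("or 1,1,1" for low, "or 2,2,2" for medium), and spin_until_set() is a hypothetical helper added only for demonstration. Note that the "empty" ppc cpu_relax() is still a compiler barrier here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-arch hooks; names follow the sketch in this mail. */
#if defined(__powerpc__) || defined(__powerpc64__)
/* ppc: begin/end lower and raise SMT priority, the in-loop part does nothing */
#define begin_cpu_relax()  __asm__ volatile("or 1,1,1" ::: "memory")
#define cpu_relax()        __asm__ volatile("" ::: "memory")
#define end_cpu_relax()    __asm__ volatile("or 2,2,2" ::: "memory")
#elif defined(__x86_64__) || defined(__i386__)
/* x86: begin/end are empty, the in-loop part is "pause" */
#define begin_cpu_relax()  do { } while (0)
#define cpu_relax()        __asm__ volatile("pause" ::: "memory")
#define end_cpu_relax()    do { } while (0)
#else
/* anything else: plain compiler barrier (assumption for portability) */
#define begin_cpu_relax()  do { } while (0)
#define cpu_relax()        __asm__ volatile("" ::: "memory")
#define end_cpu_relax()    do { } while (0)
#endif

/* Spin until *flag is set; returns how many times we had to relax. */
static unsigned long spin_until_set(const volatile bool *flag)
{
	unsigned long spins = 0;

	assert(flag != NULL);
	begin_cpu_relax();
	while (!*flag) {
		cpu_relax();
		spins++;
	}
	end_cpu_relax();
	return spins;
}
```

The caller's loop stays ordinary C; only the three hooks differ per architecture.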

Or maybe "begin" just clears a counter, while "cpu_relax()" does some
"increase iterations, and lower priority after X iterations", and then
"end" raises the priority again.

The "do magic having a special loop" approach disturbs me. I'd much
rather have more explicit hooks that allow people to do their own loop
semantics (including having a "return" to exit early).
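
To make the counter variant and the early "return" case concrete, here is a small sketch under stated assumptions. The state struct, RELAX_SPIN_THRESHOLD (standing in for the "X iterations"), and the arch_lower_priority()/arch_raise_priority() stubs are all hypothetical. The stubs only count calls so the behaviour can be observed, and wait_for_flag() is an illustrative caller, not anything from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RELAX_SPIN_THRESHOLD 16	/* hypothetical "X iterations" */

struct cpu_relax_state {
	unsigned int iterations;
	bool lowered;
};

/* Stand-ins for the real priority instructions; they just count calls. */
static unsigned int priority_changes;
static void arch_lower_priority(void) { priority_changes++; }
static void arch_raise_priority(void) { priority_changes++; }

/* "begin" just clears the counter */
static inline void begin_cpu_relax(struct cpu_relax_state *s)
{
	s->iterations = 0;
	s->lowered = false;
}

/* in-loop: count iterations, lower priority once after the threshold */
static inline void cpu_relax(struct cpu_relax_state *s)
{
	if (!s->lowered && ++s->iterations >= RELAX_SPIN_THRESHOLD) {
		arch_lower_priority();
		s->lowered = true;
	}
}

/* "end" raises the priority again, but only if it was lowered */
static inline void end_cpu_relax(struct cpu_relax_state *s)
{
	if (s->lowered) {
		arch_raise_priority();
		s->lowered = false;
	}
}

/* Caller-owned loop with an early exit; returns false on timeout. */
static bool wait_for_flag(const volatile bool *flag, unsigned long max_spins)
{
	struct cpu_relax_state s;

	assert(flag != NULL);
	begin_cpu_relax(&s);
	while (!*flag) {
		if (max_spins-- == 0) {
			end_cpu_relax(&s);	/* every exit path must pair with begin */
			return false;
		}
		cpu_relax(&s);
	}
	end_cpu_relax(&s);
	return true;
}
```

Short waits never touch the priority at all, which is the point of the counter. The cost of explicit hooks is that every early exit has to remember its end_cpu_relax().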

But that depends on architectures having some pattern that we *can*
abstract. Would some "begin/in-loop/end" pattern like the above be
sufficient? The pure "in-loop" case we have now (ie "cpu_relax()")
clearly isn't sufficient.

I think s390 might have issues too, since they tried to have that
"cpu_relax_yield" thing (which is only used by stop_machine), and
they've tried cpu_relax_lowlatency() and other games.

                    Linus
