Date:   Thu, 6 Dec 2018 19:18:46 +0000
From:   Will Deacon <will.deacon@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        ard.biesheuvel@...aro.org, catalin.marinas@....com, rml@...h9.net,
        tglx@...utronix.de, schwidefsky@...ibm.com
Subject: Re: [PATCH v2 0/2] arm64: Only call into preempt_schedule() if
 need_resched()

Hi Peter,

On Thu, Dec 06, 2018 at 04:08:50PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 30, 2018 at 05:34:29PM +0000, Will Deacon wrote:
> > This is version two of the patches I originally posted here:
> > 
> >   http://lkml.kernel.org/r/1543347902-21170-1-git-send-email-will.deacon@arm.com
> > 
> > The only change since v1 is that __preempt_count_dec_and_test() now
> > reloads the need_resched flag if it initially saw that it was set. This
> > resolves the issue spotted by Peter, where an IRQ coming in during the
> > decrement can cause a reschedule to be missed.
> 
> Yes, I think this one will work, so:
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>

Thanks!

> However, this leaves me wondering if the sequence is actually much
> better than what you had?
> 
> I suppose there's a win due to cache locality -- you only have to load a
> single line -- but I'm thinking that on pure instruction count, you're
> not actually winning much.

The fast path is still slightly shorter in terms of executed instructions,
but you're right that the win most likely comes from cache locality:
everything hits in the cache or the store buffer when we're not preempting,
so we run through the code quickly and avoid the unconditional call to
preempt_schedule().
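
For reference, the trick is that the preempt count and an inverted
need_resched flag share a single 64-bit word. A sketch of that layout
(illustrative names, little-endian assumed; the real definition is in the
arm64 thread_info changes in this series):

  #include <linux/types.h>        /* u32, u64 */

  /*
   * Sketch only: the 32-bit preempt count sits in the low half of a
   * 64-bit word, with an *inverted* need_resched flag in the high half,
   * so the whole word reads as zero exactly when the count has hit zero
   * and a reschedule is pending. The single "ldr x0, [x1, #16]" in the
   * "After" listing below fetches both halves at once.
   */
  union preempt_word {
          u64 preempt_count;              /* whole word: 0 => preempt  */
          struct {
                  u32 count;              /* low half (little-endian)  */
                  u32 need_resched;       /* 0 => resched needed       */
          } preempt;
  };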

Will

--->8

// Before
  20:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
  24:   910003fd        mov     x29, sp
  28:   d5384101        mrs     x1, sp_el0
  2c:   b9401020        ldr     w0, [x1, #16]
  30:   51000400        sub     w0, w0, #0x1
  34:   b9001020        str     w0, [x1, #16]
  38:   350000a0        cbnz    w0, 4c <preempt_enable+0x2c>
  3c:   f9400020        ldr     x0, [x1]
  40:   721f001f        tst     w0, #0x2
  44:   54000040        b.eq    4c <preempt_enable+0x2c>  // b.none
  48:   94000000        bl      0 <preempt_schedule>
  4c:   a8c17bfd        ldp     x29, x30, [sp], #16
  50:   d65f03c0        ret

// After
  20:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
  24:   910003fd        mov     x29, sp
  28:   d5384101        mrs     x1, sp_el0
  2c:   f9400820        ldr     x0, [x1, #16]
  30:   d1000400        sub     x0, x0, #0x1
  34:   b9001020        str     w0, [x1, #16]
  38:   b5000080        cbnz    x0, 48 <preempt_enable+0x28>
  3c:   94000000        bl      0 <preempt_schedule>
  40:   a8c17bfd        ldp     x29, x30, [sp], #16
  44:   d65f03c0        ret
  48:   f9400820        ldr     x0, [x1, #16]
  4c:   b5ffffa0        cbnz    x0, 40 <preempt_enable+0x20>
  50:   94000000        bl      0 <preempt_schedule>
  54:   17fffffb        b       40 <preempt_enable+0x20>
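
Assuming the union above lives inside struct thread_info (as the #16
offset in the listings suggests), the C behind the "After" sequence is
roughly the following. This is a reconstruction for illustration, not a
quote of the patch:

  /* Rough shape of the v2 __preempt_count_dec_and_test(); names and
   * details reconstructed from the discussion above. */
  static inline bool __preempt_count_dec_and_test(void)
  {
          struct thread_info *ti = current_thread_info();
          u64 pc = READ_ONCE(ti->preempt_count);  /* one 64-bit load */

          /* Write back only the count half; need_resched untouched. */
          WRITE_ONCE(ti->preempt.count, --pc);

          /*
           * !pc is the fast path: the whole word decremented to zero,
           * so the count is zero and need_resched was already set.
           * Otherwise reload, so that a need_resched set by an IRQ
           * landing between the load and the store is not missed --
           * the v1 -> v2 change discussed above.
           */
          return !pc || !READ_ONCE(ti->preempt_count);
  }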
