Date:   Mon, 13 Dec 2021 23:05:01 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     linux-arm-kernel@...ts.infradead.org, ardb@...nel.org,
        catalin.marinas@....com, juri.lelli@...hat.com,
        linux-kernel@...r.kernel.org, mingo@...hat.com, will@...nel.org
Subject: Re: [PATCH 5/6] sched/preempt: add PREEMPT_DYNAMIC using static keys

On Tue, Nov 09, 2021 at 05:24:07PM +0000, Mark Rutland wrote:
> Where an architecture selects HAVE_STATIC_CALL but not
> HAVE_STATIC_CALL_INLINE, each static call has an out-of-line trampoline
> which will either branch to a callee or return to the caller.
> 
> On such architectures, a number of constraints can conspire to make
> those trampolines more complicated and potentially less useful than we'd
> like. For example:
> 
> * Hardware and software control flow integrity schemes can require the
>   addition of "landing pad" instructions (e.g. `BTI` for arm64), which
>   will also be present at the "real" callee.
> 
> * Limited branch ranges can require that trampolines generate or load an
>   address into a register and perform an indirect branch (or at least
>   have a slow path that does so). This loses some of the benefits of
>   having a direct branch.
> 
> * Interaction with SW CFI schemes can be complicated and fragile, e.g.
>   requiring that we can recognise idiomatic codegen and remove
>   indirections, at least until clang provides more helpful
>   mechanisms for dealing with this.
> 
> For PREEMPT_DYNAMIC, we don't need the full power of static calls, as we
> really only need to enable/disable specific preemption functions. We can
> achieve the same effect without a number of the pain points above by
> using static keys to fold early return cases into the preemption
> functions themselves rather than in an out-of-line trampoline,
> effectively inlining the trampoline into the start of the function.
> 
> For arm64, this results in good code generation, e.g. the
> dynamic_cond_resched() wrapper looks as follows (with the first `B` being
> replaced with a `NOP` when the function is disabled):
> 
> | <dynamic_cond_resched>:
> |        bti     c
> |        b       <dynamic_cond_resched+0x10>
> |        mov     w0, #0x0                        // #0
> |        ret
> |        mrs     x0, sp_el0
> |        ldr     x0, [x0, #8]
> |        cbnz    x0, <dynamic_cond_resched+0x8>
> |        paciasp
> |        stp     x29, x30, [sp, #-16]!
> |        mov     x29, sp
> |        bl      <preempt_schedule_common>
> |        mov     w0, #0x1                        // #1
> |        ldp     x29, x30, [sp], #16
> |        autiasp
> |        ret
> 
> ... compared to the regular form of the function:
> 
> | <__cond_resched>:
> |        bti     c
> |        mrs     x0, sp_el0
> |        ldr     x1, [x0, #8]
> |        cbz     x1, <__cond_resched+0x18>
> |        mov     w0, #0x0                        // #0
> |        ret
> |        paciasp
> |        stp     x29, x30, [sp, #-16]!
> |        mov     x29, sp
> |        bl      <preempt_schedule_common>
> |        mov     w0, #0x1                        // #1
> |        ldp     x29, x30, [sp], #16
> |        autiasp
> |        ret
> 
> Any architecture which implements static keys should be able to use this
> to implement PREEMPT_DYNAMIC with similar cost to non-inlined static
> calls.
> 
> Signed-off-by: Mark Rutland <mark.rutland@....com>
> Cc: Ard Biesheuvel <ardb@...nel.org>
> Cc: Frederic Weisbecker <frederic@...nel.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Juri Lelli <juri.lelli@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
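
For reference, the pattern described above (folding the early-return case
into the start of the function instead of an out-of-line trampoline) can be
modelled roughly as below. This is only a user-space sketch under stated
assumptions: the plain bool stands in for a static key (in the kernel the
check would be a patched `B`/`NOP` rather than a load), and every name in it
is hypothetical, not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a static key. In the kernel this check is a
 * code-patched branch, not a memory load and compare. */
bool sk_cond_resched_enabled = true;

/* Hypothetical stand-ins for scheduler state and the reschedule slow path. */
bool need_resched_flag;
int reschedule_count;

/* Models the regular function: reschedule only if a reschedule is pending. */
int model_cond_resched(void)
{
	if (!need_resched_flag)
		return 0;
	need_resched_flag = false;
	reschedule_count++;	/* stands in for preempt_schedule_common() */
	return 1;
}

/* Models the dynamic wrapper: the "trampoline" is folded into the function
 * entry as an early return taken when the function is disabled. */
int model_dynamic_cond_resched(void)
{
	if (!sk_cond_resched_enabled)
		return 0;
	return model_cond_resched();
}
```

With the flag cleared, model_dynamic_cond_resched() returns 0 without
touching the pending-reschedule state; with it set, it behaves like
model_cond_resched().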

Does anyone have an opinion on that? Can we do better on the arm64 static call
side, or should we resign ourselves to going in that static keys direction?

Also, I assume that arm64 will need a static call implementation sooner or
later...

Thanks.
