linux-kernel - Re: [PATCH v5 06/27] arm64: Delay daif masking for user return


Message-ID: <59fa96d5-6bfa-c3aa-94d6-5941a7576bfa@arm.com>
Date:   Wed, 12 Sep 2018 11:31:24 +0100
From:   James Morse <james.morse@....com>
To:     Julien Thierry <julien.thierry@....com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        daniel.thompson@...aro.org, joel@...lfernandes.org,
        marc.zyngier@....com, mark.rutland@....com,
        christoffer.dall@....com, catalin.marinas@....com,
        will.deacon@....com
Subject: Re: [PATCH v5 06/27] arm64: Delay daif masking for user return

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> Masking daif flags is done very early before returning to EL0.
> 
> Only toggle the interrupt masking while in the vector entry and mask daif
> once in kernel_exit.

I had an earlier version that did this, but it showed up as a performance
problem. Commit 8d66772e869e ("arm64: Mask all exceptions during kernel_exit")
described it as:
|    Adding a naked 'disable_daif' to kernel_exit causes a performance problem
|    for micro-benchmarks that do no real work, (e.g. calling getpid() in a
|    loop). This is because the ret_to_user loop has already masked IRQs so
|    that the TIF_WORK_MASK thread flags can't change underneath it, adding
|    disable_daif is an additional self-synchronising operation.
|
|    In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
|    flags from an SError, in which case the ret_to_user loop must mask SError
|    while it examines the flags.

We may decide that the benchmark is silly, and that we don't care about this.
(At the time it was easy enough to work around.)
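
For reference, a minimal sketch of the kind of micro-benchmark the commit
message means, assuming a Linux userspace with clock_gettime(). The function
name getpid_ns_per_call() is made up for illustration, and syscall(SYS_getpid)
is used rather than getpid() only so that a libc that caches the pid cannot
skip the kernel entry/exit path:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: average cost in nanoseconds of one getpid syscall.
 * The syscall does no real work, so the figure is dominated by kernel
 * entry and exit, which is where an extra disable_daif would show up.
 */
static double getpid_ns_per_call(long iterations)
{
	struct timespec start, end;
	double elapsed_ns;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++)
		syscall(SYS_getpid);	/* raw syscall, bypasses any libc caching */
	clock_gettime(CLOCK_MONOTONIC, &end);

	elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
		     (double)(end.tv_nsec - start.tv_nsec);
	return elapsed_ns / (double)iterations;
}
```

Calling this with a few million iterations before and after the patch would
give a rough per-syscall comparison; it says nothing about real workloads.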

We need regular-IRQs masked when we read the TIF flags, and to stay masked until
we return to user-space.
I assume you're changing this so that pseudo-NMIs stay unmasked on the return
to EL0 until kernel_exit.
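
To spell out the constraint, here is a toy C model of the ret_to_user loop.
Everything in it (struct model_task, MODEL_TIF_WORK_MASK, the model_*
functions) is a made-up stand-in rather than kernel code, and the work handler
is simplified to "unmask IRQs, clear the pending work":

```c
#include <stdbool.h>

/* Hypothetical stand-in for _TIF_WORK_MASK, not the kernel's value. */
#define MODEL_TIF_WORK_MASK	0x3u

struct model_task {
	unsigned int flags;	/* plays the role of the TIF flags */
	bool irqs_masked;	/* plays the role of PSTATE.I */
};

/*
 * Simplification: the work handler may run with IRQs unmasked, so new
 * work could be flagged while it runs. Here it only clears the work.
 */
static void model_do_notify_resume(struct model_task *t)
{
	t->irqs_masked = false;
	t->flags &= ~MODEL_TIF_WORK_MASK;
}

/* Returns how many times work had to be handled before the exit. */
static int model_ret_to_user(struct model_task *t)
{
	int passes = 0;

	for (;;) {
		unsigned int work;

		t->irqs_masked = true;		/* disable_irq */
		work = t->flags & MODEL_TIF_WORK_MASK;
		if (!work)
			break;			/* exit with IRQs still masked */
		model_do_notify_resume(t);
		passes++;
	}
	return passes;
}
```

The point is only that the flags are re-read with IRQs masked on every pass,
and that the mask is still in place when the loop falls through to the exit.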

I'd like to be able to change the TIF flags from the SError handlers for RAS,
which means masking SError for do_notify_resume too. (The RAS code that does
this doesn't exist today, so you can make this my problem to work out later!)
I think we should have pseudo-NMI masked if SError is masked too.

Is there a strong reason for having pseudo-NMI unmasked during
do_notify_resume(), or is it just for having the maximum amount of code exposed?

Thanks,

James

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 09dbea22..85ce06ac 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -259,9 +259,9 @@ alternative_else_nop_endif
>  	.endm
>  
>  	.macro	kernel_exit, el
> -	.if	\el != 0
>  	disable_daif
>  
> +	.if	\el != 0
>  	/* Restore the task's original addr_limit. */
>  	ldr	x20, [sp, #S_ORIG_ADDR_LIMIT]
>  	str	x20, [tsk, #TSK_TI_ADDR_LIMIT]
> @@ -896,7 +896,7 @@ work_pending:
>   * "slow" syscall return path.
>   */
>  ret_to_user:
> -	disable_daif
> +	disable_irq				// disable interrupts
>  	ldr	x1, [tsk, #TSK_TI_FLAGS]
>  	and	x2, x1, #_TIF_WORK_MASK
>  	cbnz	x2, work_pending
>