Date:   Mon, 19 Sep 2016 14:25:48 -0400
From:   bdegraaf@...eaurora.org
To:     Robin Murphy <robin.murphy@....com>
Cc:     catalin.marinas@....com, will.deacon@....com, mark.rutland@....com,
        jungseoklee85@...il.com, andre.przywara@....com,
        timur@...eaurora.org, linux-kernel@...r.kernel.org,
        james.morse@....com, apinski@...ium.com, labbott@...hat.com,
        linux-arm-kernel@...ts.infradead.org, cov@...eaurora.org
Subject: Re: [RFC] arm64: Ensure proper addressing for ldnp/stnp

On 2016-09-19 14:01, Robin Murphy wrote:
> On 19/09/16 18:36, Brent DeGraaf wrote:
>> According to section 6.3.8 of the ARM Programmer's Guide, non-temporal
>> loads and stores do not verify that address dependency is met between a
>> load of an address to a register and a subsequent non-temporal load or
>> store using that address on the executing PE. Therefore, context switch
>> code and subroutine calls that use non-temporally accessed addresses as
>> parameters that might depend on a load of an address into an argument
>> register must ensure that ordering requirements are met by introducing
>> a barrier prior to the successive non-temporal access.  Add appropriate
>> barriers wherever this specific situation comes into play.
>> 
>> Signed-off-by: Brent DeGraaf <bdegraaf@...eaurora.org>
>> ---
>>  arch/arm64/kernel/entry.S  | 1 +
>>  arch/arm64/lib/copy_page.S | 2 ++
>>  2 files changed, 3 insertions(+)
>> 
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 441420c..982c4d3 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -679,6 +679,7 @@ ENTRY(cpu_switch_to)
>>  	ldp	x27, x28, [x8], #16
>>  	ldp	x29, x9, [x8], #16
>>  	ldr	lr, [x8]
>> +	dmb	nshld	// Existence of instructions with loose load-use dependencies (e.g. ldnp/stnp) make this barrier necessary
>>  	mov	sp, x9
>>  	and	x9, x9, #~(THREAD_SIZE - 1)
>>  	msr	sp_el0, x9
>> diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
>> index 4c1e700..21c6892 100644
>> --- a/arch/arm64/lib/copy_page.S
>> +++ b/arch/arm64/lib/copy_page.S
>> @@ -47,6 +47,8 @@ alternative_endif
>>  	ldp	x14, x15, [x1, #96]
>>  	ldp	x16, x17, [x1, #112]
>> 
>> +	dmb	nshld // In case x0 (for stnp) is dependent on a load
> 
> The ARMv8 ARM (B2.7.2 in issue j) says that when an address dependency
> exists between a load and a subsequent LDNP, *other* observers may
> observe those accesses in any order. How's that related to an STNP on
> the same CPU?
> 
> Robin.
> 
>> +
>>  	mov	x18, #(PAGE_SIZE - 128)
>>  	add	x1, x1, #128
>>  1:
>> 

Yes, I have seen the section in the ARM ARM about this. But the
Programmer's Guide goes further, even providing a concrete example:

"Non-temporal loads and stores relax the memory ordering requirements...the
LDNP instruction might be observed before the preceding LDR instruction,
which can result in reading from an unpredictable address in X0.

For example:

    LDR X0, [X3]
    LDNP X2, X1, [X0]

To correct the above, you need an explicit load barrier:

    LDR X0, [X3]
    DMB NSHLD
    LDNP X2, X1, [X0]"

Did the ARM ARM leave this out?  Or is the Programmer's Guide section 
incorrect?
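
To make the pattern under discussion concrete, here is a rough C sketch of
the load / barrier / dependent-load sequence from the Programmer's Guide
excerpt. It is only an illustration under stated assumptions, not the
proposed kernel change: load_barrier() and load_pair_via_pointer() are
hypothetical names, the compiler will normally emit ordinary loads rather
than LDNP (so the barrier here only marks where it would sit), and on
non-arm64 builds a C11 acquire fence is used as a stand-in for DMB NSHLD.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: a load barrier between the pointer load and the
 * access that uses the pointer. On AArch64 this emits the DMB NSHLD from
 * the quoted example; elsewhere an acquire fence is an assumed stand-in.
 */
static inline void load_barrier(void)
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
	__asm__ volatile("dmb nshld" ::: "memory");
#else
	atomic_thread_fence(memory_order_acquire);
#endif
}

/*
 * Hypothetical helper mirroring:
 *	LDR  X0, [X3]
 *	DMB  NSHLD
 *	LDNP X2, X1, [X0]
 * The two plain loads below stand in for the LDNP; a C compiler is not
 * expected to generate a non-temporal pair load for them.
 */
static inline void load_pair_via_pointer(uint64_t *const *slot, uint64_t out[2])
{
	assert(slot != NULL && out != NULL);

	const uint64_t *p = *slot;	/* load the address (LDR X0, [X3]) */
	load_barrier();			/* order the address load first */
	out[0] = p[0];			/* dependent access through that address */
	out[1] = p[1];
}
```

A caller would pass the address of a pointer variable plus a two-element
output array, e.g. load_pair_via_pointer(&ptr, out).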

Thanks for your comments,
Brent
