Message-ID: <19d2b1c6-ef45-49f5-b11b-a57adc522852@arm.com>
Date: Tue, 15 Apr 2025 18:28:14 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Will Deacon <will@...nel.org>, Pasha Tatashin
 <pasha.tatashin@...een.com>, Andrew Morton <akpm@...ux-foundation.org>,
 Uladzislau Rezki <urezki@...il.com>, Christoph Hellwig <hch@...radead.org>,
 David Hildenbrand <david@...hat.com>,
 "Matthew Wilcox (Oracle)" <willy@...radead.org>,
 Mark Rutland <mark.rutland@....com>,
 Anshuman Khandual <anshuman.khandual@....com>,
 Alexandre Ghiti <alexghiti@...osinc.com>,
 Kevin Brodsky <kevin.brodsky@....com>, linux-arm-kernel@...ts.infradead.org,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel
 mappings

On 15/04/2025 11:51, Catalin Marinas wrote:
> On Mon, Apr 14, 2025 at 07:28:46PM +0100, Ryan Roberts wrote:
>> On 14/04/2025 18:38, Catalin Marinas wrote:
>>> On Tue, Mar 04, 2025 at 03:04:41PM +0000, Ryan Roberts wrote:
>>>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>>>> index 1898c3069c43..149df945c1ab 100644
>>>> --- a/arch/arm64/include/asm/pgtable.h
>>>> +++ b/arch/arm64/include/asm/pgtable.h
>>>> @@ -40,6 +40,55 @@
>>>>  #include <linux/sched.h>
>>>>  #include <linux/page_table_check.h>
>>>>  
>>>> +static inline void emit_pte_barriers(void)
>>>> +{
>>>> +	/*
>>>> +	 * These barriers are emitted under certain conditions after a pte entry
>>>> +	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
>>>> +	 * visible to the table walker. The isb ensures that any previous
>>>> +	 * speculative "invalid translation" marker that is in the CPU's
>>>> +	 * pipeline gets cleared, so that any access to that address after
>>>> +	 * setting the pte to valid won't cause a spurious fault. If the thread
>>>> +	 * gets preempted after storing to the pgtable but before emitting these
>>>> +	 * barriers, __switch_to() emits a dsb which ensures the walker gets to
>>>> +	 * see the store. There is no guarantee of an isb being issued though.
>>>> +	 * This is safe because it will still get issued (albeit on a
>>>> +	 * potentially different CPU) when the thread starts running again,
>>>> +	 * before any access to the address.
>>>> +	 */
>>>> +	dsb(ishst);
>>>> +	isb();
>>>> +}
>>>> +
>>>> +static inline void queue_pte_barriers(void)
>>>> +{
>>>> +	if (test_thread_flag(TIF_LAZY_MMU))
>>>> +		set_thread_flag(TIF_LAZY_MMU_PENDING);
>>>
>>> As we can have lots of calls here, it might be slightly cheaper to test
>>> TIF_LAZY_MMU_PENDING and avoid setting it unnecessarily.
>>
>> Yes, good point.
>>
>>> I haven't checked - does the compiler generate multiple mrs from sp_el0
>>> for subsequent test_thread_flag()?
>>
>> It emits a single mrs but it loads from the pointer twice.
> 
> It's not that bad if we only do the set_thread_flag() once.
> 
>> I think v3 is the version we want?
>>
>>
>> void TEST_queue_pte_barriers_v1(void)
>> {
>> 	if (test_thread_flag(TIF_LAZY_MMU))
>> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
>> 	else
>> 		emit_pte_barriers();
>> }
>>
>> void TEST_queue_pte_barriers_v2(void)
>> {
>> 	if (test_thread_flag(TIF_LAZY_MMU) &&
>> 	    !test_thread_flag(TIF_LAZY_MMU_PENDING))
>> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
>> 	else
>> 		emit_pte_barriers();
>> }
>>
>> void TEST_queue_pte_barriers_v3(void)
>> {
>> 	unsigned long flags = read_thread_flags();
>>
>> 	if ((flags & (_TIF_LAZY_MMU | _TIF_LAZY_MMU_PENDING)) == _TIF_LAZY_MMU)
>> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
>> 	else
>> 		emit_pte_barriers();
>> }
> 
> Doesn't v3 emit barriers once _TIF_LAZY_MMU_PENDING has been set? We
> need something like:
> 
> 	if (flags & _TIF_LAZY_MMU) {
> 		if (!(flags & _TIF_LAZY_MMU_PENDING))
> 			set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	} else {
> 		emit_pte_barriers();
> 	}

Gah, yeah sorry, going too quickly. v2 is also logically incorrect.

Fixed versions:

void TEST_queue_pte_barriers_v2(void)
{
	if (test_thread_flag(TIF_LAZY_MMU)) {
		if (!test_thread_flag(TIF_LAZY_MMU_PENDING))
			set_thread_flag(TIF_LAZY_MMU_PENDING);
	} else {
		emit_pte_barriers();
	}
}

void TEST_queue_pte_barriers_v3(void)
{
	unsigned long flags = read_thread_flags();

	if (flags & BIT(TIF_LAZY_MMU)) {
		if (!(flags & BIT(TIF_LAZY_MMU_PENDING)))
			set_thread_flag(TIF_LAZY_MMU_PENDING);
	} else {
		emit_pte_barriers();
	}
}
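
To make the difference between the first v3 attempt and the fixed one concrete, here is a minimal user-space sketch. It is only a model, not kernel code: struct model_thread, the MODEL_* masks and the model_* functions are hypothetical stand-ins, a plain OR replaces set_thread_flag(), and a counter replaces emit_pte_barriers(). The bit positions (31 and 32) are taken from the disassembly in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for _TIF_LAZY_MMU and _TIF_LAZY_MMU_PENDING. */
#define MODEL_LAZY_MMU          (UINT64_C(1) << 31)
#define MODEL_LAZY_MMU_PENDING  (UINT64_C(1) << 32)

struct model_thread {
	uint64_t flags;     /* stands in for the thread flags word */
	unsigned barriers;  /* counts modelled dsb+isb pairs */
};

/* Condition from the first v3 attempt: falls through to the barriers
 * as soon as PENDING is already set. */
static void model_queue_buggy(struct model_thread *t)
{
	uint64_t flags = t->flags;

	if ((flags & (MODEL_LAZY_MMU | MODEL_LAZY_MMU_PENDING)) == MODEL_LAZY_MMU)
		t->flags |= MODEL_LAZY_MMU_PENDING;
	else
		t->barriers++;
}

/* Fixed shape: barriers are emitted only outside lazy MMU mode. */
static void model_queue_fixed(struct model_thread *t)
{
	uint64_t flags = t->flags;

	if (flags & MODEL_LAZY_MMU) {
		if (!(flags & MODEL_LAZY_MMU_PENDING))
			t->flags |= MODEL_LAZY_MMU_PENDING;
	} else {
		t->barriers++;
	}
}

/* Model n pte updates and return how many barrier pairs were emitted. */
static unsigned model_run(void (*queue)(struct model_thread *), bool lazy,
			  unsigned n)
{
	struct model_thread t = { .flags = lazy ? MODEL_LAZY_MMU : 0 };

	for (unsigned i = 0; i < n; i++)
		queue(&t);
	return t.barriers;
}

/* Self-check of the model's expected behaviour. */
static void model_self_check(void)
{
	/* Buggy: first update sets PENDING, the next two emit barriers. */
	assert(model_run(model_queue_buggy, true, 3) == 2);
	/* Fixed: nothing is emitted while in lazy MMU mode. */
	assert(model_run(model_queue_fixed, true, 3) == 0);
	/* Outside lazy MMU mode every update emits barriers. */
	assert(model_run(model_queue_fixed, false, 3) == 3);
}
```

Calling model_self_check() from a small main() exercises all three cases. The model says nothing about atomicity or preemption; it only checks the branch structure.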

000000000000105c <TEST_queue_pte_barriers_v2>:
    105c:	d5384100 	mrs	x0, sp_el0
    1060:	f9400001 	ldr	x1, [x0]
    1064:	37f80081 	tbnz	w1, #31, 1074 <TEST_queue_pte_barriers_v2+0x18>
    1068:	d5033a9f 	dsb	ishst
    106c:	d5033fdf 	isb
    1070:	d65f03c0 	ret
    1074:	f9400001 	ldr	x1, [x0]
    1078:	b707ffc1 	tbnz	x1, #32, 1070 <TEST_queue_pte_barriers_v2+0x14>
    107c:	14000004 	b	108c <TEST_queue_pte_barriers_v2+0x30>
    1080:	d2c00021 	mov	x1, #0x100000000           	// #4294967296
    1084:	f821301f 	stset	x1, [x0]
    1088:	d65f03c0 	ret
    108c:	f9800011 	prfm	pstl1strm, [x0]
    1090:	c85f7c01 	ldxr	x1, [x0]
    1094:	b2600021 	orr	x1, x1, #0x100000000
    1098:	c8027c01 	stxr	w2, x1, [x0]
    109c:	35ffffa2 	cbnz	w2, 1090 <TEST_queue_pte_barriers_v2+0x34>
    10a0:	d65f03c0 	ret

00000000000010a4 <TEST_queue_pte_barriers_v3>:
    10a4:	d5384101 	mrs	x1, sp_el0
    10a8:	f9400020 	ldr	x0, [x1]
    10ac:	36f80060 	tbz	w0, #31, 10b8 <TEST_queue_pte_barriers_v3+0x14>
    10b0:	b60000a0 	tbz	x0, #32, 10c4 <TEST_queue_pte_barriers_v3+0x20>
    10b4:	d65f03c0 	ret
    10b8:	d5033a9f 	dsb	ishst
    10bc:	d5033fdf 	isb
    10c0:	d65f03c0 	ret
    10c4:	14000004 	b	10d4 <TEST_queue_pte_barriers_v3+0x30>
    10c8:	d2c00020 	mov	x0, #0x100000000           	// #4294967296
    10cc:	f820303f 	stset	x0, [x1]
    10d0:	d65f03c0 	ret
    10d4:	f9800031 	prfm	pstl1strm, [x1]
    10d8:	c85f7c20 	ldxr	x0, [x1]
    10dc:	b2600000 	orr	x0, x0, #0x100000000
    10e0:	c8027c20 	stxr	w2, x0, [x1]
    10e4:	35ffffa2 	cbnz	w2, 10d8 <TEST_queue_pte_barriers_v3+0x34>
    10e8:	d65f03c0 	ret

So v3 is the way to go, I think; it's a single mrs and a single ldr.
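
For anyone wanting to see the load-count difference without reading the disassembly, here is a rough user-space sketch. Everything in it is an assumption made for illustration: struct sketch_flags, sketch_load() and the SKETCH_* masks are hypothetical, and sketch_load() simply counts how often the flags word is read, standing in for the separate ldr that each test_thread_flag() call produces in the v2 output above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks; bit 32 matches the #0x100000000 immediate above. */
#define SKETCH_LAZY_MMU          (UINT64_C(1) << 31)
#define SKETCH_LAZY_MMU_PENDING  (UINT64_C(1) << 32)

struct sketch_flags {
	uint64_t word;   /* stands in for the thread flags word */
	unsigned loads;  /* number of modelled loads of that word */
};

/* Each call models one load of the flags word. */
static uint64_t sketch_load(struct sketch_flags *f)
{
	f->loads++;
	return f->word;
}

/* v2 shape: every flag test is its own load. */
static void sketch_v2(struct sketch_flags *f)
{
	if (sketch_load(f) & SKETCH_LAZY_MMU) {
		if (!(sketch_load(f) & SKETCH_LAZY_MMU_PENDING))
			f->word |= SKETCH_LAZY_MMU_PENDING;
	}
	/* else: the barriers would be emitted here */
}

/* v3 shape: one snapshot, both tests use the local copy. */
static void sketch_v3(struct sketch_flags *f)
{
	uint64_t flags = sketch_load(f);

	if (flags & SKETCH_LAZY_MMU) {
		if (!(flags & SKETCH_LAZY_MMU_PENDING))
			f->word |= SKETCH_LAZY_MMU_PENDING;
	}
	/* else: the barriers would be emitted here */
}

/* Run one call starting from the given flags and return the load count. */
static unsigned sketch_count_loads(void (*fn)(struct sketch_flags *),
				   uint64_t initial)
{
	struct sketch_flags f = { .word = initial, .loads = 0 };

	fn(&f);
	return f.loads;
}

/* Self-check of the model's expected load counts. */
static void sketch_self_check(void)
{
	assert(SKETCH_LAZY_MMU_PENDING == UINT64_C(0x100000000));
	assert(sketch_count_loads(sketch_v2, SKETCH_LAZY_MMU) == 2);
	assert(sketch_count_loads(sketch_v3, SKETCH_LAZY_MMU) == 1);
	assert(sketch_count_loads(sketch_v2, 0) == 1);
}
```

In lazy MMU mode the v2 shape reads the word twice and the v3 shape once, which mirrors the two ldr instructions versus one in the listings above. The counts here follow from how the model is written, so this illustrates the point rather than measuring anything.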

I'll get this fixed up and posted early next week.

Thanks,
Ryan

