Message-ID: <636c8901-6e05-479f-ae06-ee391d4d36e8@arm.com>
Date: Wed, 25 Jun 2025 14:07:23 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Catalin Marinas <catalin.marinas@....com>,
 Mikołaj Lenczewski <miko.lenczewski@....com>
Cc: yang@...amperecomputing.com, will@...nel.org, jean-philippe@...aro.org,
 robin.murphy@....com, joro@...tes.org, maz@...nel.org,
 oliver.upton@...ux.dev, joey.gouly@....com, james.morse@....com,
 broonie@...nel.org, ardb@...nel.org, baohua@...nel.org,
 suzuki.poulose@....com, david@...hat.com, jgg@...pe.ca, nicolinc@...dia.com,
 jsnitsel@...hat.com, mshavit@...gle.com, kevin.tian@...el.com,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 iommu@...ts.linux.dev
Subject: Re: [PATCH v7 4/4] arm64/mm: Elide tlbi in contpte_convert() under
 BBML2

On 20/06/2025 17:10, Ryan Roberts wrote:
> On 19/06/2025 20:29, Catalin Marinas wrote:
>> On Tue, Jun 17, 2025 at 09:51:04AM +0000, Mikołaj Lenczewski wrote:
>>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>>> index bcac4f55f9c1..203357061d0a 100644
>>> --- a/arch/arm64/mm/contpte.c
>>> +++ b/arch/arm64/mm/contpte.c
>>> @@ -68,7 +68,144 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
>>>  			pte = pte_mkyoung(pte);
>>>  	}
>>>  
>>> -	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
>>> +	/*
>>> +	 * On eliding the __tlb_flush_range() under BBML2+noabort:
>>> +	 *
>>> +	 * NOTE: Instead of using N=16 as the contiguous block length, we use
>>> +	 *       N=4 for clarity.
>>> +	 *
>>> +	 * NOTE: 'n' and 'c' are used to denote the "contiguous bit" being
>>> +	 *       unset and set, respectively.
>>> +	 *
>>> +	 * We worry about two cases where contiguous bit is used:
>>> +	 *  - When folding N smaller non-contiguous ptes as 1 contiguous block.
>>> +	 *  - When unfolding a contiguous block into N smaller non-contiguous ptes.
>>> +	 *
>>> +	 * Currently, the BBML0 folding case looks as follows:
>>> +	 *
>>> +	 *  0) Initial page-table layout:
>>> +	 *
>>> +	 *   +----+----+----+----+
>>> +	 *   |RO,n|RO,n|RO,n|RW,n| <--- last page being set as RO
>>> +	 *   +----+----+----+----+
>>> +	 *
>>> +	 *  1) Aggregate AF + dirty flags using __ptep_get_and_clear():
>>> +	 *
>>> +	 *   +----+----+----+----+
>>> +	 *   |  0 |  0 |  0 |  0 |
>>> +	 *   +----+----+----+----+
>>> +	 *
>>> +	 *  2) __flush_tlb_range():
>>> +	 *
>>> +	 *   |____ tlbi + dsb ____|
>>> +	 *
>>> +	 *  3) __set_ptes() to repaint contiguous block:
>>> +	 *
>>> +	 *   +----+----+----+----+
>>> +	 *   |RO,c|RO,c|RO,c|RO,c|
>>> +	 *   +----+----+----+----+
>>
>> From the initial layout to point (3), we are also changing the
>> permission. Given the rules you mentioned in the Arm ARM, I think that's
>> safe (hardware seeing either the old or the new attributes). The
>> FEAT_BBM description, however, only talks about change between larger
>> and smaller blocks but no mention of also changing the attributes at the
>> same time. Hopefully the microarchitects claiming certain CPUs don't
>> generate conflict aborts understood what Linux does.

I think what you are saying is that despite going via invalid, the HW may see
this direct transition:

+----+----+----+----+
|RO,n|RO,n|RO,n|RW,n|
+----+----+----+----+
to:
+----+----+----+----+
|RO,c|RO,c|RO,c|RO,c|
+----+----+----+----+

There are 2 logical operations here. The first is changing the permissions of
the last entry. The second is changing the size of the entry.

As I understand it, it's permissible in the architecture to update the
permissions of a PTE without break-before-make and without issuing a tlbi
afterwards; in that case the HW may apply either the old permissions or the new
permissions up until a future tlbi (after which the new permissions are
guaranteed). That's the first logical operation.

The second logical operation is permitted by FEAT_BBM level 2.

So both logical operations are permitted and the Arm ARM doesn't mention any
requirement to "separate" these operations with a tlbi or anything else, as far
as I can see.

So my interpretation is that combining these 2 in the way we have should be
safe. RNGLXZ and RJQQTC give further insight into the spirit of the spec. But I
agree this isn't spelled out super clearly.
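
To make the 2 logical operations concrete, here is a toy userspace sketch (not
kernel code). It assumes N=4 as in the patch comment. Everything named toy_* is
made up for illustration, and toy_hw_may_allow_write() only encodes my reading
of the "old or new permissions until a tlbi" rule described above, not anything
taken from the Arm ARM text:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_N 4	/* toy block length, matching the N=4 used in the patch comment */

/* Hypothetical toy PTE; this is not the kernel's pte_t. */
struct toy_pte {
	bool writable;	/* RW if true, RO if false */
	bool cont;	/* 'c' if true, 'n' if false */
};

/* Logical operation 1: permission change on a single entry (RW -> RO). */
static void toy_wrprotect(struct toy_pte *pte)
{
	pte->writable = false;
}

/* In this model a block is foldable only once all entries share permissions. */
static bool toy_can_fold(const struct toy_pte ptes[TOY_N])
{
	for (int i = 1; i < TOY_N; i++)
		if (ptes[i].writable != ptes[0].writable)
			return false;
	return true;
}

/* Logical operation 2: size change, setting the contiguous bit on all entries. */
static void toy_fold(struct toy_pte ptes[TOY_N])
{
	assert(toy_can_fold(ptes));
	for (int i = 0; i < TOY_N; i++)
		ptes[i].cont = true;
}

/*
 * Assumed model of the permission rule: before a tlbi the HW may apply
 * either the old or the new permission; after a tlbi only the new one.
 */
static bool toy_hw_may_allow_write(bool old_writable, bool new_writable,
				   bool tlbi_done)
{
	if (tlbi_done)
		return new_writable;
	return old_writable || new_writable;
}
```

Starting from |RO,n|RO,n|RO,n|RW,n|, calling toy_wrprotect() on the last entry
and then toy_fold() on the block gives |RO,c|RO,c|RO,c|RO,c|. In this model,
the only thing that stays undetermined until a later tlbi is whether a write to
the last page is still allowed.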

Perhaps we can move forwards based on this understanding, and I will seek some
clarifying words to be added to the Arm ARM?

Thanks,
Ryan
