Message-ID: <636c8901-6e05-479f-ae06-ee391d4d36e8@arm.com>
Date: Wed, 25 Jun 2025 14:07:23 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Catalin Marinas <catalin.marinas@....com>,
Mikołaj Lenczewski <miko.lenczewski@....com>
Cc: yang@...amperecomputing.com, will@...nel.org, jean-philippe@...aro.org,
robin.murphy@....com, joro@...tes.org, maz@...nel.org,
oliver.upton@...ux.dev, joey.gouly@....com, james.morse@....com,
broonie@...nel.org, ardb@...nel.org, baohua@...nel.org,
suzuki.poulose@....com, david@...hat.com, jgg@...pe.ca, nicolinc@...dia.com,
jsnitsel@...hat.com, mshavit@...gle.com, kevin.tian@...el.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev
Subject: Re: [PATCH v7 4/4] arm64/mm: Elide tlbi in contpte_convert() under
BBML2
On 20/06/2025 17:10, Ryan Roberts wrote:
> On 19/06/2025 20:29, Catalin Marinas wrote:
>> On Tue, Jun 17, 2025 at 09:51:04AM +0000, Mikołaj Lenczewski wrote:
>>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>>> index bcac4f55f9c1..203357061d0a 100644
>>> --- a/arch/arm64/mm/contpte.c
>>> +++ b/arch/arm64/mm/contpte.c
>>> @@ -68,7 +68,144 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
>>> pte = pte_mkyoung(pte);
>>> }
>>>
>>> - __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
>>> + /*
>>> + * On eliding the __tlb_flush_range() under BBML2+noabort:
>>> + *
>>> + * NOTE: Instead of using N=16 as the contiguous block length, we use
>>> + * N=4 for clarity.
>>> + *
>>> + * NOTE: 'n' and 'c' are used to denote the "contiguous bit" being
>>> + * unset and set, respectively.
>>> + *
>>> + * We worry about two cases where contiguous bit is used:
>>> + * - When folding N smaller non-contiguous ptes as 1 contiguous block.
>>> + * - When unfolding a contiguous block into N smaller non-contiguous ptes.
>>> + *
>>> + * Currently, the BBML0 folding case looks as follows:
>>> + *
>>> + * 0) Initial page-table layout:
>>> + *
>>> + * +----+----+----+----+
>>> + * |RO,n|RO,n|RO,n|RW,n| <--- last page being set as RO
>>> + * +----+----+----+----+
>>> + *
>>> + * 1) Aggregate AF + dirty flags using __ptep_get_and_clear():
>>> + *
>>> + * +----+----+----+----+
>>> + * | 0 | 0 | 0 | 0 |
>>> + * +----+----+----+----+
>>> + *
>>> + * 2) __flush_tlb_range():
>>> + *
>>> + * |____ tlbi + dsb ____|
>>> + *
>>> + * 3) __set_ptes() to repaint contiguous block:
>>> + *
>>> + * +----+----+----+----+
>>> + * |RO,c|RO,c|RO,c|RO,c|
>>> + * +----+----+----+----+
>>
>> From the initial layout to point (3), we are also changing the
>> permission. Given the rules you mentioned in the Arm ARM, I think that's
>> safe (hardware seeing either the old or the new attributes). The
>> FEAT_BBM description, however, only talks about change between larger
>> and smaller blocks but no mention of also changing the attributes at the
>> same time. Hopefully the microarchitects claiming certain CPUs don't
>> generate conflict aborts understood what Linux does.

I think what you are saying is that despite going via invalid, the HW may see
this direct transition:

+----+----+----+----+
|RO,n|RO,n|RO,n|RW,n|
+----+----+----+----+

to:

+----+----+----+----+
|RO,c|RO,c|RO,c|RO,c|
+----+----+----+----+

There are 2 logical operations here. The first is changing the permissions of
the last entry. The second is changing the size of the entry.

As I understand it, it's permissible in the architecture to update the
permissions of a PTE without break-before-make and without issuing a tlbi
afterwards; in that case the HW may apply either the old permissions or the new
permissions up until a future tlbi (after which the new permissions are
guaranteed). That's the first logical operation.

The second logical operation is permitted by FEAT_BBM level 2.

So both logical operations are permitted and the Arm ARM doesn't mention any
requirement to "separate" these operations with a tlbi or anything else, as far
as I can see.

So my interpretation is that combining these 2 operations in the way we have
should be safe. RNGLXZ and RJQQTC give further insight into the spirit of the
spec. But I agree this isn't spelled out super clearly.
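
The "old or new permissions until a future tlbi" reasoning above can be sketched as a toy user-space model. This is not kernel code and not a statement of what the architecture guarantees; it only encodes the interpretation given in this mail. Every name here (toy_pte, toy_page, toy_fold_no_tlbi, toy_tlbi, toy_may_write) is hypothetical, and a single cached entry per page is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_N 4 /* toy contiguous block length (N=4, as in the patch comment) */

enum toy_perm { TOY_RO, TOY_RW };

struct toy_pte {
	enum toy_perm perm;
	bool cont; /* contiguous bit: 'c' when true, 'n' when false */
};

struct toy_page {
	struct toy_pte table; /* what the page table currently says */
	struct toy_pte tlb;   /* what a possibly stale TLB entry says */
	bool tlb_valid;       /* assumption: at most one cached entry per page */
};

/* Initial layout |RO,n|RO,n|RO,n|RW,n|, with every entry cached in the TLB. */
static void toy_init(struct toy_page pages[TOY_N])
{
	for (int i = 0; i < TOY_N; i++) {
		pages[i].table.perm = (i == TOY_N - 1) ? TOY_RW : TOY_RO;
		pages[i].table.cont = false;
		pages[i].tlb = pages[i].table;
		pages[i].tlb_valid = true;
	}
}

/* Repaint the block as |RO,c|RO,c|RO,c|RO,c| without invalidating the TLB. */
static void toy_fold_no_tlbi(struct toy_page pages[TOY_N])
{
	for (int i = 0; i < TOY_N; i++) {
		pages[i].table.perm = TOY_RO;
		pages[i].table.cont = true;
	}
}

/* A later tlbi drops every stale entry in the block. */
static void toy_tlbi(struct toy_page pages[TOY_N])
{
	for (int i = 0; i < TOY_N; i++)
		pages[i].tlb_valid = false;
}

/* May the HW still treat this page as writable? Old or new may apply. */
static bool toy_may_write(const struct toy_page *p)
{
	if (p->tlb_valid && p->tlb.perm == TOY_RW)
		return true; /* stale (old) permissions still cached */
	return p->table.perm == TOY_RW;
}
```

In this model, calling toy_init() and then toy_fold_no_tlbi() leaves the last page possibly writable through its stale entry, while the first three pages are read-only under both the old and the new entries. Once toy_tlbi() runs, only the new read-only permissions remain for every page.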

Perhaps we can move forwards based on this understanding, and I will seek some
clarifying words to be added to the Arm ARM?

Thanks,
Ryan