[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <447d4ee9-bf63-44f0-a064-fea5a5f1acef@os.amperecomputing.com>
Date: Tue, 5 Aug 2025 11:37:38 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Ryan Roberts <ryan.roberts@....com>, will@...nel.org,
catalin.marinas@....com, akpm@...ux-foundation.org, Miko.Lenczewski@....com,
dev.jain@....com, scott@...amperecomputing.com, cl@...two.org
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v6 0/4] arm64: support FEAT_BBM level 2 and large
block mapping when rodata=full
Hi Ryan
On 8/5/25 1:13 AM, Ryan Roberts wrote:
> Hi All,
>
> This is a new version built on top of Yang Shi's work at [1]. Yang and I have
> been discussing (disagreeing?) about the best way to implement the last 2
> patches. So I've reworked them and am posting as RFC to illustrate how I think
> this feature should be implemented, but I've retained Yang as primary author
> since it is all based on his work. I'd appreciate feedback from Catalin and/or
> Will on whether this is the right approach, so that hopefully we can get this
> into shape for 6.18.
>
> The first 2 patches are unchanged from Yang's v5; the first patch comes from Dev
> and the rest of the series depends upon it.
Thank you for making the prototype and retaining me as primary author.
The approach is basically fine to me. But there are some minor concerns.
Some of them were raised in the comment for patch #3 and patch #4. I put
them together here.
1. Walk page table twice. This has been discussed before. It is not
very efficient for small split, for example, 4K. Unfortunately, the most
split is still 4K in the current kernel AFAICT.
Hopefully this can be mitigated by some new development, for
example, ROX cache.
2. Take mutex lock twice and do lazy mmu twice. I think it is easy
to resolve as I suggested in patch #3.
3. Walk every PTE and call split on every PTE for repainting. It is
not very efficient, but may be ok for repainting since it is just called
once at boot time.
I don't think these concerns are major blockers IMHO. Anyway let's see
what Catalin and/or Will think about this.
Regards,
Yang
>
> I've tested this on an AmpereOne system (a VM with 12G RAM) in all 3 possible
> modes by hacking the BBML2 feature detection code:
>
> - mode 1: All CPUs support BBML2 so the linear map uses large mappings
> - mode 2: Boot CPU does not support BBML2 so linear map uses pte mappings
> - mode 3: Boot CPU supports BBML2 but secondaries do not so linear map
> initially uses large mappings but is then repainted to use pte mappings
>
> In all cases, mm selftests run and no regressions are observed. In all cases,
> ptdump of linear map is as expected:
>
> Mode 1:
> =======
> ---[ Linear Mapping start ]---
> 0xffff000000000000-0xffff000000200000 2M PMD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000000200000-0xffff000000210000 64K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000000210000-0xffff000000400000 1984K PTE ro NX SHD AF UXN MEM/NORMAL
> 0xffff000000400000-0xffff000002400000 32M PMD ro NX SHD AF BLK UXN MEM/NORMAL
> 0xffff000002400000-0xffff000002550000 1344K PTE ro NX SHD AF UXN MEM/NORMAL
> 0xffff000002550000-0xffff000002600000 704K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000002600000-0xffff000004000000 26M PMD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000004000000-0xffff000040000000 960M PMD RW NX SHD AF CON BLK UXN MEM/NORMAL-TAGGED
> 0xffff000040000000-0xffff000140000000 4G PUD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000140000000-0xffff000142000000 32M PMD RW NX SHD AF CON BLK UXN MEM/NORMAL-TAGGED
> 0xffff000142000000-0xffff000142120000 1152K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000142120000-0xffff000142128000 32K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142128000-0xffff000142159000 196K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142159000-0xffff000142160000 28K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142160000-0xffff000142240000 896K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000142240000-0xffff00014224e000 56K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff00014224e000-0xffff000142250000 8K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142250000-0xffff000142260000 64K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142260000-0xffff000142280000 128K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000142280000-0xffff000142288000 32K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142288000-0xffff000142290000 32K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142290000-0xffff0001422a0000 64K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff0001422a0000-0xffff000142465000 1812K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142465000-0xffff000142470000 44K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000142470000-0xffff000142600000 1600K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000142600000-0xffff000144000000 26M PMD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000144000000-0xffff000180000000 960M PMD RW NX SHD AF CON BLK UXN MEM/NORMAL-TAGGED
> 0xffff000180000000-0xffff000181a00000 26M PMD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000181a00000-0xffff000181b90000 1600K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000181b90000-0xffff000181b9d000 52K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181b9d000-0xffff000181c80000 908K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181c80000-0xffff000181c90000 64K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181c90000-0xffff000181ca0000 64K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000181ca0000-0xffff000181dbd000 1140K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181dbd000-0xffff000181dc0000 12K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181dc0000-0xffff000181e00000 256K PTE RW NX SHD AF CON UXN MEM/NORMAL-TAGGED
> 0xffff000181e00000-0xffff000182000000 2M PMD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000182000000-0xffff0001c0000000 992M PMD RW NX SHD AF CON BLK UXN MEM/NORMAL-TAGGED
> 0xffff0001c0000000-0xffff000300000000 5G PUD RW NX SHD AF BLK UXN MEM/NORMAL-TAGGED
> 0xffff000300000000-0xffff008000000000 500G PUD
> 0xffff008000000000-0xffff800000000000 130560G PGD
> ---[ Linear Mapping end ]---
>
> Mode 3:
> =======
> ---[ Linear Mapping start ]---
> 0xffff000000000000-0xffff000000210000 2112K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000000210000-0xffff000000400000 1984K PTE ro NX SHD AF UXN MEM/NORMAL
> 0xffff000000400000-0xffff000002400000 32M PMD ro NX SHD AF BLK UXN MEM/NORMAL
> 0xffff000002400000-0xffff000002550000 1344K PTE ro NX SHD AF UXN MEM/NORMAL
> 0xffff000002550000-0xffff000143a61000 5264452K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000143a61000-0xffff000143c61000 2M PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000143c61000-0xffff000181b9a000 1015012K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181b9a000-0xffff000181d9a000 2M PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000181d9a000-0xffff000300000000 6261144K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
> 0xffff000300000000-0xffff008000000000 500G PUD
> 0xffff008000000000-0xffff800000000000 130560G PGD
> ---[ Linear Mapping end ]---
>
> [1] https://lore.kernel.org/linux-arm-kernel/20250724221216.1998696-1-yang@os.amperecomputing.com/
>
> Thanks,
> Ryan
>
> Dev Jain (1):
> arm64: Enable permission change on arm64 kernel block mappings
>
> Yang Shi (3):
> arm64: cpufeature: add AmpereOne to BBML2 allow list
> arm64: mm: support large block mapping when rodata=full
> arm64: mm: split linear mapping if BBML2 unsupported on secondary CPUs
>
> arch/arm64/include/asm/cpufeature.h | 2 +
> arch/arm64/include/asm/mmu.h | 4 +
> arch/arm64/include/asm/pgtable.h | 5 +
> arch/arm64/kernel/cpufeature.c | 17 +-
> arch/arm64/mm/mmu.c | 368 +++++++++++++++++++++++++++-
> arch/arm64/mm/pageattr.c | 161 +++++++++---
> arch/arm64/mm/proc.S | 25 +-
> include/linux/pagewalk.h | 3 +
> mm/pagewalk.c | 24 ++
> 9 files changed, 566 insertions(+), 43 deletions(-)
>
> --
> 2.43.0
>
Powered by blists - more mailing lists