Message-Id: <20240326101448.3453626-1-ryan.roberts@arm.com>
Date: Tue, 26 Mar 2024 10:14:45 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Ard Biesheuvel <ardb@...nel.org>,
David Hildenbrand <david@...hat.com>,
Donald Dutile <ddutile@...hat.com>,
Eric Chanudet <echanude@...hat.com>
Cc: Ryan Roberts <ryan.roberts@....com>,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: [PATCH v1 0/3] Speed up boot with faster linear map creation
Hi All,
It turns out that creating the linear map can take a significant proportion of
the total boot time, especially when rodata=full. And a large portion of the
time it takes to create the linear map is issuing TLBIs. This series reworks the
kernel pgtable generation code to significantly reduce the number of TLBIs. See
each patch for details.
The table below shows the execution time of map_mem() across a couple of
different systems with different RAM configurations. We measure after applying
each patch and show the improvement relative to base (v6.9-rc1):

               | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
               | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
---------------|-------------|-------------|-------------|-------------
               |   ms    (%) |   ms    (%) |   ms    (%) |    ms    (%)
---------------|-------------|-------------|-------------|-------------
base           |  151   (0%) | 2191   (0%) | 8990   (0%) | 17443   (0%)
no-cont-remap  |   77 (-49%) |  429 (-80%) | 1753 (-80%) |  3796 (-78%)
no-alloc-remap |   77 (-49%) |  375 (-83%) | 1532 (-83%) |  3366 (-81%)
lazy-unmap     |   63 (-58%) |  330 (-85%) | 1312 (-85%) |  2929 (-83%)

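
For anyone wanting to re-derive the percentage column, here is a minimal sketch that recomputes the change relative to base from the millisecond figures in the table. The helper names (relative_change, changes) are purely illustrative and not part of the series. Recomputing from the rounded timings agrees with the table to within one percentage point; any residual difference presumably comes from the table having been generated from unrounded measurements.

```python
# Timings in ms, copied from the table; columns are the four systems
# (M2 VM 16G, Altra VM 64G, Altra VM 256G, Altra Metal 512G).
timings_ms = {
    "base":           [151, 2191, 8990, 17443],
    "no-cont-remap":  [77, 429, 1753, 3796],
    "no-alloc-remap": [77, 375, 1532, 3366],
    "lazy-unmap":     [63, 330, 1312, 2929],
}


def relative_change(ms, base_ms):
    """Percentage change versus base, rounded to the nearest integer."""
    return round(100 * (ms - base_ms) / base_ms)


def changes(timings):
    """Map each configuration to its per-system change relative to 'base'."""
    base = timings["base"]
    return {
        name: [relative_change(ms, b) for ms, b in zip(row, base)]
        for name, row in timings.items()
    }


if __name__ == "__main__":
    for name, row in changes(timings_ms).items():
        print(f"{name:<15}" + " ".join(f"{pct:>5}%" for pct in row))
```

A negative value means map_mem() ran faster than on the unpatched kernel.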
This series applies on top of v6.9-rc1. All mm selftests pass. I haven't yet
tested all VA size configs (although I don't anticipate any issues); I'll do
this as part of a follow-up.
Thanks,
Ryan
Ryan Roberts (3):
arm64: mm: Don't remap pgtables per- cont(pte|pmd) block
arm64: mm: Don't remap pgtables for allocate vs populate
arm64: mm: Lazily clear pte table mappings from fixmap
 arch/arm64/include/asm/fixmap.h  |   5 +-
 arch/arm64/include/asm/mmu.h     |   8 +
 arch/arm64/include/asm/pgtable.h |   4 -
 arch/arm64/kernel/cpufeature.c   |  10 +-
 arch/arm64/mm/fixmap.c           |  11 +
 arch/arm64/mm/mmu.c              | 364 +++++++++++++++++++++++--------
 include/linux/pgtable.h          |   8 +
 7 files changed, 307 insertions(+), 103 deletions(-)
--
2.25.1