[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b2d3e684-e3dc-41b5-9708-ca5926c55ebf@arm.com>
Date: Wed, 6 Aug 2025 08:20:09 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Yang Shi <yang@...amperecomputing.com>, will@...nel.org,
catalin.marinas@....com, akpm@...ux-foundation.org, Miko.Lenczewski@....com,
dev.jain@....com, scott@...amperecomputing.com, cl@...two.org
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] arm64: mm: support large block mapping when
rodata=full
On 05/08/2025 19:53, Yang Shi wrote:
[...]
>>> + arch_enter_lazy_mmu_mode();
>>> + ret = split_pgd(pgd_offset_k(start), start, end);
>> My instinct still remains that it would be better not to iterate over the range
>> here, but instead call a "split(start); split(end);" since we just want to split
>> the start and end. So the code would be simpler and probably more performant if
>> we get rid of all the iteration.
>
> It should be more performant for splitting large range, especially the range
> includes leaf mappings at different levels. But I had some optimization to skip
> leaf mappings in this version, so it should be close to your implementation from
> performance perspective. And it just walks the page table once instead of twice.
> It should be more efficient for small split, for example, 4K.
I guess this is the crux of our disagreement. I think the "walks the table once
for 4K" is a micro optimization, which I doubt we would see on any benchmark
results. In the absence of data, I'd prefer the simpler, smaller, easier to
understand version.
Both implementations are on list now; perhaps the maintainers can steer us.
Thanks,
Ryan
Powered by blists - more mailing lists