Message-Id: <20250806055111.1519608-1-liqiang01@kylinos.cn>
Date: Wed, 6 Aug 2025 13:51:11 +0800
From: Li Qiang <liqiang01@...inos.cn>
To: akpm@...ux-foundation.org,
david@...hat.com
Cc: linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com,
vbabka@...e.cz,
rppt@...nel.org,
surenb@...gle.com,
mhocko@...e.com
Subject: Re: [PATCH] mm: memory: Force-inline PTE/PMD zapping functions for performance
On Tue, 5 Aug 2025 14:35:22, Lorenzo Stoakes wrote:
> I'm not sure, actual workloads would be best but presumably you don't have
> one where you've noticed a demonstrable difference otherwise you'd have
> mentioned...
>
> At any rate I've come around on this series, and think this is probably
> reasonable, but I would like to see what increasing max-inline-insns-single
> does first?
Thank you for your suggestions. I'll pay closer attention
to email formatting in future communications.
Regarding the performance tests on x86_64 architecture:
Parameter Observation:
When setting max-inline-insns-single=400 (matching arm64's
default value) without applying my patch, the compiler
automatically inlines the critical functions.
Benchmark Results:
Configuration          Baseline      With Patch             max-inline-insns-single=400
UnixBench Score        1824          1835 (+0.6%)           1840 (+0.9%)
vmlinux Size (bytes)   35,379,608    35,379,786 (+0.005%)   35,529,641 (+0.4%)
Key Findings:
The patch provides a measurable performance gain (+0.6%)
with negligible size impact (+0.005%). Raising
max-inline-insns-single to 400 yields slightly better
performance (+0.9%), but incurs a larger size penalty (+0.4%).
Conclusion:
The patch achieves a better performance/size trade-off than
globally raising the inline threshold. The targeted approach
(selective __always_inline) appears more efficient for this
specific optimization.