Message-ID: <20251130124304.184107-1-qq570070308@gmail.com>
Date: Sun, 30 Nov 2025 20:43:04 +0800
From: Xie Yuanbin <qq570070308@...il.com>
To: acme@...nel.org,
adrian.hunter@...el.com,
agordeev@...ux.ibm.com,
akpm@...ux-foundation.org,
alex@...ti.fr,
alexander.shishkin@...ux.intel.com,
andreas@...sler.com,
anna-maria@...utronix.de,
anshuman.khandual@....com,
aou@...s.berkeley.edu,
arnd@...db.de,
borntraeger@...ux.ibm.com,
bp@...en8.de,
bsegall@...gle.com,
dave.hansen@...ux.intel.com,
davem@...emloft.net,
david@...nel.org,
dietmar.eggemann@....com,
frederic@...nel.org,
gor@...ux.ibm.com,
hca@...ux.ibm.com,
hpa@...or.com,
irogers@...gle.com,
james.clark@...aro.org,
jolsa@...nel.org,
juri.lelli@...hat.com,
justinstitt@...gle.com,
lorenzo.stoakes@...cle.com,
luto@...nel.org,
mark.rutland@....com,
mathieu.desnoyers@...icios.com,
max.kellermann@...os.com,
mgorman@...e.de,
mingo@...hat.com,
morbo@...gle.com,
namhyung@...nel.org,
nathan@...nel.org,
nick.desaulniers+lkml@...il.com,
nysal@...ux.ibm.com,
palmer@...belt.com,
paulmck@...nel.org,
peterz@...radead.org,
pjw@...nel.org,
riel@...riel.com,
rostedt@...dmis.org,
ryan.roberts@....com,
segher@...nel.crashing.org,
svens@...ux.ibm.com,
tglx@...utronix.de,
thuth@...hat.com,
urezki@...il.com,
vincent.guittot@...aro.org,
vschneid@...hat.com,
linux@...linux.org.uk
Cc: linux-kernel@...r.kernel.org,
x86@...nel.org,
linux-arm-kernel@...ts.infradead.org,
linux-riscv@...ts.infradead.org,
linux-s390@...r.kernel.org,
sparclinux@...r.kernel.org,
linux-perf-users@...r.kernel.org,
llvm@...ts.linux.dev
Subject: Re: [PATCH v4 0/3] Optimize code generation during context switching
On Sun, 23 Nov 2025 20:18:24 +0800, Xie Yuanbin wrote:
> This series of patches primarily make some functions called in context
> switching as always inline to optimize performance. Here is the
> performance test data for these patches:
> Time spent on calling finish_task_switch(), the unit is tsc from x86:
> | test scenario | old | new | delta |
> | gcc 15.2 | 13.94 | 12.40 | 1.54 (-11.1%) |
> | gcc 15.2 + spectre_v2 | 24.78 | 13.70 | 11.08 (-44.7%) |
> | clang 21.1.4 | 13.90 | 12.71 | 1.19 (- 8.6%) |
> | clang 21.1.4 + spectre_v2 | 29.01 | 18.91 | 10.1 (-34.8%) |
Hi everyone, I also ran a performance test on a Raspberry Pi 3B. I
hope this will be helpful in merging the patch.
The following is the test data:
Time spent calling finish_task_switch(); the clocksource and unit are
cntvct_el0 ticks on aarch64:
| test scenario | old | new | delta |
| gcc 15.2 | 2.00 | 1.68 | 0.32 (-16.0%) |
| clang 21.1.6 | 2.15 | 1.68 | 0.47 (-21.9%) |
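For readers who want to recompute the delta column, here is a minimal sketch of the arithmetic, assuming the delta is old minus new and the percentage is taken relative to the old value. The names `demo_delta` and `demo_delta_of` are hypothetical and are not part of the patch series or the test harness.

```c
/* Hypothetical helper, not part of the patch series. */
struct demo_delta {
	double abs;	/* old - new, in counter ticks */
	double pct;	/* relative change in percent, negative means faster */
};

/* Compute the absolute and relative change between two average timings. */
struct demo_delta demo_delta_of(double old_t, double new_t)
{
	struct demo_delta d;

	d.abs = old_t - new_t;
	d.pct = -100.0 * d.abs / old_t;
	return d;
}
```

For the gcc 15.2 row, demo_delta_of(2.00, 1.68) gives a delta of about 0.32 and a relative change of about -16.0%.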
Since the Raspberry Pi 3B uses a Cortex-A53 processor, it is not affected
by the Spectre v2 vulnerability, as defined by the safe list in
arch/arm64/kernel/proton-pack.c:
```c
static const struct midr_range spectre_v2_safe_list[] = {
	MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
	MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
	MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
	MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
	MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
	MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_2XX_SILVER),
	MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
	MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
	{ /* sentinel */ }
};
```
Link: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/kernel/proton-pack.c?id=7d31f578f3230f3b7b33b0930b08f9afd8429817#n152
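To illustrate how a sentinel-terminated list like the one above can be searched, here is a simplified userspace sketch. It is not the kernel's implementation: the `demo_` names, the struct layout, and the mask are assumptions made for illustration, and the sketch compares only the implementer and part-number fields of MIDR_EL1 while ignoring variant and revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct midr_range. */
struct demo_midr_entry {
	uint32_t model;	/* implementer and part number; 0 marks the sentinel */
};

/* MIDR_EL1 layout: implementer in bits [31:24], part number in bits [15:4]. */
#define DEMO_MODEL_MASK		0xff00fff0u
#define DEMO_MODEL(impl, part)	(((uint32_t)(impl) << 24) | ((uint32_t)(part) << 4))

/* Shortened example list; Arm Ltd. is implementer 0x41. */
const struct demo_midr_entry demo_safe_list[] = {
	{ DEMO_MODEL(0x41, 0xd04) },	/* Cortex-A35 */
	{ DEMO_MODEL(0x41, 0xd03) },	/* Cortex-A53 */
	{ DEMO_MODEL(0x41, 0xd05) },	/* Cortex-A55 */
	{ 0 }				/* sentinel */
};

/* Walk the list until the zeroed sentinel and report whether midr matches. */
bool demo_midr_is_listed(uint32_t midr, const struct demo_midr_entry *list)
{
	for (; list->model; list++) {
		if ((midr & DEMO_MODEL_MASK) == list->model)
			return true;
	}
	return false;
}
```

With the example MIDR value 0x410fd034 (a Cortex-A53), demo_midr_is_listed() returns true. With 0x410fd083 (a Cortex-A72, part number 0xd08), it returns false because that part is not in the shortened list.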
Perhaps I can test the performance with the spectre_v2 mitigation enabled
on a Raspberry Pi 4B in the future.
In order to make the test results stable, I fixed the CPU frequency by
setting config.txt as follows:
```config
core_freq_fixed=1
arm_freq=800
arm_freq_min=800
gpu_freq=300
core_freq=300
h264_freq=300
isp_freq=300
v3d_freq=300
hevc_freq=300
sdram_freq=400
gpu_freq_min=300
core_freq_min=300
h264_freq_min=300
isp_freq_min=300
v3d_freq_min=300
hevc_freq_min=300
sdram_freq_min=400
```
The test source is commit 7d31f578f323 ("Add linux-next specific files
for 20251128") from linux-next, built with the default defconfig plus
the following settings:
CONFIG_ARM64_SVE=n
CONFIG_COMPAT=n
CONFIG_COMPAT_32BIT_TIME=n
CONFIG_ARM64_PTR_AUTH=n
CONFIG_ARM64_GCS=n
CONFIG_ARM64_MTE=n
CONFIG_SHADOW_CALL_STACK=y
CONFIG_SCHED_AUTOGROUP=n
CONFIG_CGROUPS=n
CONFIG_KVM=n
CONFIG_HZ_100=y
CONFIG_HZ=100
Thanks very much!
Xie Yuanbin