Message-ID: <87346gbd04.ffs@tglx>
Date: Fri, 14 Nov 2025 21:00:43 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Xie Yuanbin <qq570070308@...il.com>, riel@...riel.com,
segher@...nel.crashing.org, david@...hat.com, peterz@...radead.org,
hpa@...or.com, osalvador@...e.de, linux@...linux.org.uk,
mathieu.desnoyers@...icios.com, paulmck@...nel.org, pjw@...nel.org,
palmer@...belt.com, aou@...s.berkeley.edu, alex@...ti.fr,
hca@...ux.ibm.com, gor@...ux.ibm.com, agordeev@...ux.ibm.com,
borntraeger@...ux.ibm.com, svens@...ux.ibm.com, davem@...emloft.net,
andreas@...sler.com, luto@...nel.org, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, acme@...nel.org, namhyung@...nel.org,
mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...nel.org, irogers@...gle.com, adrian.hunter@...el.com,
james.clark@...aro.org, anna-maria@...utronix.de, frederic@...nel.org,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, nathan@...nel.org,
nick.desaulniers+lkml@...il.com, morbo@...gle.com, justinstitt@...gle.com,
qq570070308@...il.com, thuth@...hat.com, brauner@...nel.org,
arnd@...db.de, sforshee@...nel.org, mhiramat@...nel.org,
andrii@...nel.org, oleg@...hat.com, jlayton@...nel.org,
aalbersh@...hat.com, akpm@...ux-foundation.org, david@...nel.org,
lorenzo.stoakes@...cle.com, baolin.wang@...ux.alibaba.com,
max.kellermann@...os.com, ryan.roberts@....com, nysal@...ux.ibm.com,
urezki@...il.com
Cc: x86@...nel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org,
linux-s390@...r.kernel.org, sparclinux@...r.kernel.org,
linux-perf-users@...r.kernel.org, llvm@...ts.linux.dev, will@...nel.org
Subject: Re: [PATCH v3 3/3] Make finish_task_switch and its subfuncs inline
in context switching
On Thu, Nov 13 2025 at 18:52, Xie Yuanbin wrote:
What are subfuncs? This is not an SMS service. Use proper words and not
made-up abbreviations.
> `finish_task_switch` is a hot path in context switching, and due to
Same comment as before about functions....
> possible mitigations inside switch_mm, performance here is greatly
> affected by function calls and branch jumps. Make it inline to optimize
> the performance.
Again, you mark them __always_inline and not inline. Most of them are
already 'inline'. Can you please be precise in your wording?
> After `finish_task_switch` is changed to an inline function, the number of
> calls to the subfunctions (called by `finish_task_switch`) increases in
> this translation unit due to the inline expansion of `finish_task_switch`.
> Due to compiler optimization strategies, these functions may transition
> from inline functions to non inline functions, which can actually lead to
> performance degradation.
I'm having a hard time understanding this word salad.
> Make the subfunctions of finish_task_stwitch inline to prevent
> degradation.
>
> Perf test:
> Time spent on calling finish_task_switch (rdtsc):
What does (rdtsc) mean?
> | compiler && appended cmdline | without patch | with patch |
> | gcc + NA | 13.93 - 13.94 | 12.39 - 12.44 |
What is NA and what are the time units of this?
> | gcc + "spectre_v2_user=on" | 24.69 - 24.85 | 13.68 - 13.73 |
> | clang + NA | 13.89 - 13.90 | 12.70 - 12.73 |
> | clang + "spectre_v2_user=on" | 29.00 - 29.02 | 18.88 - 18.97 |
So the real benefit is observable when spectre_v2_user mitigations are
enabled. You completely fail to explain that.
> Perf test info:
> 1. kernel source:
> linux-next
> commit 9c0826a5d9aa4d52206d ("Add linux-next specific files for 20251107")
> 2. compiler:
> gcc: gcc version 15.2.0 (Debian 15.2.0-7) with
> GNU ld (GNU Binutils for Debian) 2.45
> clang: Debian clang version 21.1.4 (8) with
> Debian LLD 21.1.4 (compatible with GNU linkers)
> 3. config:
> base on default x86_64_defconfig, and setting:
> CONFIG_HZ=100
> CONFIG_DEBUG_ENTRY=n
> CONFIG_X86_DEBUG_FPU=n
> CONFIG_EXPERT=y
> CONFIG_MODIFY_LDT_SYSCALL=n
> CONFIG_CGROUPS=n
> CONFIG_BLK_DEV_NVME=y
This really can go into the comment section below the first '---'
separator. No point in having this in the change log.
> Size test:
> bzImage size:
> | compiler | without patches | with patches |
> | clang | 13722624 | 13722624 |
> | gcc | 12596224 | 12596224 |
bzImage size is completely irrelevant. What's interesting is how the
size of the actual function changes.
> Size test info:
> 1. kernel source && compiler: same as above
> 2. config:
> base on default x86_64_defconfig, and setting:
> CONFIG_SCHED_CORE=y
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_NO_HZ_FULL=y
And again, we all know how to build a kernel.