Message-ID: <20251115150928.649-1-qq570070308@gmail.com>
Date: Sat, 15 Nov 2025 23:09:28 +0800
From: Xie Yuanbin <qq570070308@...il.com>
To: tglx@...utronix.de
Cc: aalbersh@...hat.com,
acme@...nel.org,
adrian.hunter@...el.com,
agordeev@...ux.ibm.com,
akpm@...ux-foundation.org,
alex@...ti.fr,
alexander.shishkin@...ux.intel.com,
andreas@...sler.com,
andrii@...nel.org,
anna-maria@...utronix.de,
aou@...s.berkeley.edu,
arnd@...db.de,
baolin.wang@...ux.alibaba.com,
borntraeger@...ux.ibm.com,
bp@...en8.de,
brauner@...nel.org,
bsegall@...gle.com,
dave.hansen@...ux.intel.com,
davem@...emloft.net,
david@...nel.org,
david@...hat.com,
dietmar.eggemann@....com,
frederic@...nel.org,
gor@...ux.ibm.com,
hca@...ux.ibm.com,
hpa@...or.com,
irogers@...gle.com,
james.clark@...aro.org,
jlayton@...nel.org,
jolsa@...nel.org,
juri.lelli@...hat.com,
justinstitt@...gle.com,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
linux-riscv@...ts.infradead.org,
linux-s390@...r.kernel.org,
linux@...linux.org.uk,
llvm@...ts.linux.dev,
lorenzo.stoakes@...cle.com,
luto@...nel.org,
mark.rutland@....com,
mathieu.desnoyers@...icios.com,
max.kellermann@...os.com,
mgorman@...e.de,
mhiramat@...nel.org,
mingo@...hat.com,
morbo@...gle.com,
namhyung@...nel.org,
nathan@...nel.org,
nick.desaulniers+lkml@...il.com,
nysal@...ux.ibm.com,
oleg@...hat.com,
osalvador@...e.de,
palmer@...belt.com,
paulmck@...nel.org,
peterz@...radead.org,
pjw@...nel.org,
qq570070308@...il.com,
riel@...riel.com,
rostedt@...dmis.org,
ryan.roberts@....com,
segher@...nel.crashing.org,
sforshee@...nel.org,
sparclinux@...r.kernel.org,
svens@...ux.ibm.com,
thuth@...hat.com,
urezki@...il.com,
vincent.guittot@...aro.org,
vschneid@...hat.com,
will@...nel.org,
x86@...nel.org
Subject: Re: [PATCH v3 3/3] Make finish_task_switch and its subfuncs inline in context switching
On Fri, 14 Nov 2025 21:00:43 +0100, Thomas Gleixner wrote:
> What are subfuncs? This is not a SMS service. Use proper words and not
> made up abbreviations.
>
> Again you mark them __always_inline and not inline. Most of them are
> already 'inline'. Can you please precise in your wording?
>
> This really can go into the comment section below the first '---'
> separator. No point in having this in the change log.
Thanks for pointing this out. I will fix all of it in the v4 patch.
>> After `finish_task_switch` is changed to an inline function, the number of
>> calls to the subfunctions (called by `finish_task_switch`) increases in
>> this translation unit due to the inline expansion of `finish_task_switch`.
>> Due to compiler optimization strategies, these functions may transition
>> from inline functions to non inline functions, which can actually lead to
>> performance degradation.
>
> I'm having a hard time to understand this word salad.
I think this description is important, because it explains why the
functions called by finish_task_switch need to be marked __always_inline.
Which part specifically is difficult to understand? Please point it out,
and I will improve the description in the v4 patch. Thank you very much!
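For illustration, here is a minimal user-space sketch of the situation the
quoted paragraph describes. All names in it (helper_hint, helper_forced,
finish_sketch, caller_a, caller_b) are hypothetical stand-ins, not the
kernel's functions, and whether a compiler really keeps helper_hint out of
line depends on its heuristics and optimization level.

```c
/* Fallback definition for toolchains whose headers do not provide it. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

static int helper_calls;

/*
 * Plain 'inline' is only a hint. Once this helper has several call
 * sites in one translation unit, the compiler may decide to emit it
 * out of line instead.
 */
static inline void helper_hint(void)
{
	helper_calls++;
}

/* __always_inline removes that freedom: the body is expanded at each call. */
static __always_inline void helper_forced(void)
{
	helper_calls++;
}

/* Stand-in for a function that is newly forced inline. */
static __always_inline void finish_sketch(void)
{
	helper_hint();
	helper_forced();
}

/*
 * Each caller receives its own copy of finish_sketch(), so the number
 * of helper_hint() call sites in this translation unit doubles.
 */
void caller_a(void)
{
	finish_sketch();
}

void caller_b(void)
{
	finish_sketch();
}
```

Comparing the disassembly of caller_a and caller_b (for example with
objdump -d) shows whether helper_hint was expanded in place or left as a
call.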
> What means (rdtsc)?
rdtsc is an x86 instruction that reads the CPU's timestamp counter, which
gives a high-precision timestamp.
The description here is not sufficient. Thanks for pointing it out, I
will improve it in the v4 patch.
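As a rough user-space sketch of what such a timestamp read looks like (the
benchmark code itself is not shown in this mail, so read_cycles and measure
are hypothetical names, and the non-x86 fallback is only an assumption made
to keep the example portable):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

/*
 * Read a high-resolution counter. On x86 this is the timestamp counter
 * via the __rdtsc() compiler intrinsic. Note that rdtsc is not a
 * serializing instruction, so very short measurements can be skewed by
 * out-of-order execution.
 */
static inline uint64_t read_cycles(void)
{
#if defined(__x86_64__) || defined(__i386__)
	return __rdtsc();
#else
	/* Assumed fallback for other architectures: monotonic clock in ns. */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Return the counter delta spent inside fn(). */
uint64_t measure(void (*fn)(void))
{
	uint64_t start = read_cycles();

	fn();
	return read_cycles() - start;
}
```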
> So the real benefit is observable when spectre_v2_user mitigations are
> enabled. You completely fail to explain that.
What kind of explanation is needed here?
```txt
When the spectre_v2_user mitigation is enabled, the kernel is likely to
perform branch prediction hardening inside switch_mm_irq_off, which can
drastically increase branch mispredictions in subsequently executed
code.
On x86, this mitigation is enabled conditionally by default, but on other
architectures, for example arm32/aarch64, the mitigation may be fully
enabled by default.
`finish_task_switch` runs right after `switch_mm_irq_off`, so making it
inline can yield a significant performance benefit.
```
Is this OK? Thank you very much!
> bzImage size is completely irrelevant. What's interesting is how the
> size of the actual function changes.
I think the bzImage size is meaningful, at least for many embedded
devices. Because the image is compressed, the code size does not directly
reflect the compressed size.
Anyway, I will add the size of the .text section in the v4 patch.
Thank you very much!
Xie Yuanbin