Message-ID: <20190219151532.GA40581@gmail.com>
Date: Tue, 19 Feb 2019 16:15:32 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Paul Turner <pjt@...gle.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
subhra.mazumdar@...cle.com,
Frédéric Weisbecker <fweisbec@...il.com>,
Kees Cook <keescook@...omium.org>, kerrnel@...gle.com
Subject: Re: [RFC][PATCH 00/16] sched: Core scheduling

* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Mon, Feb 18, 2019 at 12:40 PM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > If there were close to no VMEXITs, it beat smt=off, if there were lots
> > of VMEXITs it was far far worse. Supposedly hosting people try their
> > very bestest to have no VMEXITs so it mostly works for them (with the
> > obvious exception of single VCPU guests).
> >
> > It's just that people have been bugging me for this crap; and I figure
> > I'd post it now that it's not exploding anymore and let others have at.
>
> The patches didn't look disgusting to me, but I admittedly just
> scanned through them quickly.
>
> Are there downsides (maintenance and/or performance) when core
> scheduling _isn't_ enabled? I guess if it's not a maintenance or
> performance nightmare when off, it's ok to just give people the
> option.
So this bit is the main straight-line performance impact when the
CONFIG_SCHED_CORE Kconfig feature is present (which I expect distros to
enable broadly):
+static inline bool sched_core_enabled(struct rq *rq)
+{
+	return static_branch_unlikely(&__sched_core_enabled) && rq->core_enabled;
+}

 static inline raw_spinlock_t *rq_lockp(struct rq *rq)
 {
+	if (sched_core_enabled(rq))
+		return &rq->core->__lock;
+
 	return &rq->__lock;
 }
This should, at least in principle, keep the runtime overhead down to a few
NOPs and a slightly bigger instruction cache footprint - modulo compiler
shenanigans.
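For readers unfamiliar with static keys, here's a rough userspace sketch of
the pattern (illustrative only - the struct, the _sketch names and the plain
bool are mine; in the kernel, static_branch_unlikely() compiles the test into
a runtime-patched NOP rather than the ordinary conditional shown here):

	#include <stdbool.h>

	/*
	 * Simplified stand-ins for the kernel structures, so the
	 * sketch compiles standalone; field names mirror the patch.
	 */
	struct rq_sketch {
		int __lock;		/* stand-in for the per-rq raw_spinlock_t */
		int core_lock;		/* stand-in for rq->core->__lock */
		bool core_enabled;
	};

	/*
	 * The kernel uses a static key here: with the key off, the test
	 * in rq_lockp_sketch() becomes a NOP and execution falls
	 * straight through to the per-rq lock. A plain bool models only
	 * the semantics, not the code-patching.
	 */
	static bool sched_core_key_sketch;

	static inline int *rq_lockp_sketch(struct rq_sketch *rq)
	{
		if (sched_core_key_sketch && rq->core_enabled)
			return &rq->core_lock;
		return &rq->__lock;
	}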
Here's the code generation impact on x86-64 defconfig:
   text    data     bss     dec     hex filename
    228      48       0     276     114 sched.core.n/cpufreq.o (ex sched.core.n/built-in.a)
    228      48       0     276     114 sched.core.y/cpufreq.o (ex sched.core.y/built-in.a)
   4438      96       0    4534    11b6 sched.core.n/completion.o (ex sched.core.n/built-in.a)
   4438      96       0    4534    11b6 sched.core.y/completion.o (ex sched.core.y/built-in.a)
   2167    2428       0    4595    11f3 sched.core.n/cpuacct.o (ex sched.core.n/built-in.a)
   2167    2428       0    4595    11f3 sched.core.y/cpuacct.o (ex sched.core.y/built-in.a)
  61099   22114     488   83701   146f5 sched.core.n/core.o (ex sched.core.n/built-in.a)
  70541   25370     508   96419   178a3 sched.core.y/core.o (ex sched.core.y/built-in.a)
   3262    6272       0    9534    253e sched.core.n/wait_bit.o (ex sched.core.n/built-in.a)
   3262    6272       0    9534    253e sched.core.y/wait_bit.o (ex sched.core.y/built-in.a)
  12235     341      96   12672    3180 sched.core.n/rt.o (ex sched.core.n/built-in.a)
  13073     917      96   14086    3706 sched.core.y/rt.o (ex sched.core.y/built-in.a)
  10293     477    1928   12698    319a sched.core.n/topology.o (ex sched.core.n/built-in.a)
  10363     509    1928   12800    3200 sched.core.y/topology.o (ex sched.core.y/built-in.a)
    886      24       0     910     38e sched.core.n/cpupri.o (ex sched.core.n/built-in.a)
    886      24       0     910     38e sched.core.y/cpupri.o (ex sched.core.y/built-in.a)
   1061      64       0    1125     465 sched.core.n/stop_task.o (ex sched.core.n/built-in.a)
   1077     128       0    1205     4b5 sched.core.y/stop_task.o (ex sched.core.y/built-in.a)
  18443     365      24   18832    4990 sched.core.n/deadline.o (ex sched.core.n/built-in.a)
  20019    2189      24   22232    56d8 sched.core.y/deadline.o (ex sched.core.y/built-in.a)
   1123       8      64    1195     4ab sched.core.n/loadavg.o (ex sched.core.n/built-in.a)
   1123       8      64    1195     4ab sched.core.y/loadavg.o (ex sched.core.y/built-in.a)
   1323       8       0    1331     533 sched.core.n/stats.o (ex sched.core.n/built-in.a)
   1323       8       0    1331     533 sched.core.y/stats.o (ex sched.core.y/built-in.a)
   1282     164      32    1478     5c6 sched.core.n/isolation.o (ex sched.core.n/built-in.a)
   1282     164      32    1478     5c6 sched.core.y/isolation.o (ex sched.core.y/built-in.a)
   1564      36       0    1600     640 sched.core.n/cpudeadline.o (ex sched.core.n/built-in.a)
   1564      36       0    1600     640 sched.core.y/cpudeadline.o (ex sched.core.y/built-in.a)
   1640      56       0    1696     6a0 sched.core.n/swait.o (ex sched.core.n/built-in.a)
   1640      56       0    1696     6a0 sched.core.y/swait.o (ex sched.core.y/built-in.a)
   1859     244      32    2135     857 sched.core.n/clock.o (ex sched.core.n/built-in.a)
   1859     244      32    2135     857 sched.core.y/clock.o (ex sched.core.y/built-in.a)
   2339       8       0    2347     92b sched.core.n/cputime.o (ex sched.core.n/built-in.a)
   2339       8       0    2347     92b sched.core.y/cputime.o (ex sched.core.y/built-in.a)
   3014      32       0    3046     be6 sched.core.n/membarrier.o (ex sched.core.n/built-in.a)
   3014      32       0    3046     be6 sched.core.y/membarrier.o (ex sched.core.y/built-in.a)
  50027     964      96   51087    c78f sched.core.n/fair.o (ex sched.core.n/built-in.a)
  51537    2484      96   54117    d365 sched.core.y/fair.o (ex sched.core.y/built-in.a)
   3192     220       0    3412     d54 sched.core.n/idle.o (ex sched.core.n/built-in.a)
   3276     252       0    3528     dc8 sched.core.y/idle.o (ex sched.core.y/built-in.a)
   3633       0       0    3633     e31 sched.core.n/pelt.o (ex sched.core.n/built-in.a)
   3633       0       0    3633     e31 sched.core.y/pelt.o (ex sched.core.y/built-in.a)
   3794     160       0    3954     f72 sched.core.n/wait.o (ex sched.core.n/built-in.a)
   3794     160       0    3954     f72 sched.core.y/wait.o (ex sched.core.y/built-in.a)
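(For reference, per-object tables like this come from running GNU size on the
two builds' built-in.a archives - size prints each archive member with the
"(ex archive)" suffix seen above. The sched.core.n/sched.core.y directory
names are of course specific to my setup:

  $ size sched.core.n/built-in.a
  $ size sched.core.y/built-in.a
)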
I'd say this one is representative:
   text    data     bss     dec     hex filename
  12235     341      96   12672    3180 sched.core.n/rt.o (ex sched.core.n/built-in.a)
  13073     917      96   14086    3706 sched.core.y/rt.o (ex sched.core.y/built-in.a)
The ~6% text bloat (12235 -> 13073 bytes) is, I believe, primarily due to the
higher rq-lock inlining overhead.
This is roughly what you'd expect from a change that wraps all 350+ inlined
uses of rq->lock - i.e. it might make sense to uninline rq_lockp().
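For illustration, here's my sketch of what that uninlining could look like
(not part of the posted series - the body moves out of line, so every call
site shrinks to a plain function call at the cost of losing the NOP
fast-path):

	/* kernel/sched/sched.h: declaration only, body leaves the header */
	extern raw_spinlock_t *rq_lockp(struct rq *rq);

	/* kernel/sched/core.c: the single out-of-line definition */
	raw_spinlock_t *rq_lockp(struct rq *rq)
	{
		if (sched_core_enabled(rq))
			return &rq->core->__lock;
		return &rq->__lock;
	}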
In terms of long-term maintenance overhead, ignoring the overhead of the
core-scheduling feature itself, the rq-lock wrappery is the biggest ugliness;
the rest is mostly isolated.
So if this actually *works*, improves the performance of some real
VMEXIT-poor SMT workloads, and allows the enabling of HyperThreading with
untrusted VMs without inviting thousands of guest roots, then I'm cautiously
in support of it.
> That all assumes that it works at all for the people who are clamoring
> for this feature, but I guess they can run some loads on it eventually.
> It's a holiday in the US right now ("Presidents' Day"), but maybe we
> can get some numbers this week?
Such numbers would be *very* helpful indeed.
Thanks,
Ingo