Message-ID: <55310CF2.6070107@redhat.com>
Date: Fri, 17 Apr 2015 15:38:58 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
gleb@...nel.org, kvm@...r.kernel.org,
Ralf Baechle <ralf@...ux-mips.org>, mtosatti@...hat.com,
luto@...nel.org
Subject: Re: [GIT PULL] First batch of KVM changes for 4.1
On 17/04/2015 15:10, Peter Zijlstra wrote:
> On Fri, Apr 17, 2015 at 02:46:57PM +0200, Paolo Bonzini wrote:
>> On 17/04/2015 12:55, Peter Zijlstra wrote:
>>> Also, it looks like you already do exactly this for other things, look
>>> at:
>>>
>>> kvm_sched_in()
>>> kvm_arch_vcpu_load()
>>> if (unlikely(vcpu->cpu != cpu) ... )
>>>
>>> So no, I don't believe for one second you need this.
>
> This [...] brings us back to where we were last
> time. There is _0_ justification for this in the patches, that alone is
> grounds enough to reject it.
Oh, we totally agree on that. I didn't commit that patch, but I already
said the commit message was insufficient.
> Why should the guest task care about the physical cpu of the vcpu;
> that's a layering fail if ever there was one.
It's totally within your right to not read the code, but then please
don't try commenting on it.
This code:
kvm_sched_in()
kvm_arch_vcpu_load()
if (unlikely(vcpu->cpu != cpu) ... )
runs in the host. The hypervisor obviously cares if the physical CPU of
the VCPU changes. It has to tell the source processor (vcpu->cpu) to
release the VCPU's data structure, and only then can it use it on the
target processor (cpu). No layering violation here.
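
A minimal sketch of the host-side check described above, written as a standalone userspace model rather than kernel code. The names toy_vcpu, toy_release_on and toy_vcpu_load are hypothetical and do not exist in KVM; only the shape of the `vcpu->cpu != cpu` test is taken from the quoted call chain.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a VCPU; not the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	int cpu;           /* physical CPU holding the VCPU state, -1 if none */
	int release_count; /* times the state was released on an old CPU */
};

/* Stand-in for telling the source processor to release the VCPU's data. */
static void toy_release_on(struct toy_vcpu *vcpu, int old_cpu)
{
	(void)old_cpu;
	vcpu->release_count++;
}

/* Same shape as the quoted check: do the extra work only when the CPU changed. */
static void toy_vcpu_load(struct toy_vcpu *vcpu, int cpu)
{
	assert(cpu >= 0);

	if (vcpu->cpu != cpu) {
		/* Release on the source CPU first, then adopt the target CPU. */
		if (vcpu->cpu != -1)
			toy_release_on(vcpu, vcpu->cpu);
		vcpu->cpu = cpu;
	}
}
```

Loading the same toy VCPU twice on one CPU performs no release; loading it on a different CPU performs exactly one.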
The task migration notifier runs in the guest, whenever the VCPU of
a task changes.
> Furthermore, the only thing that migration handler seems to do is
> increment a variable that is not actually used in that file.
It's used in the vDSO, so you cannot increment it in the file that uses it.
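
To illustrate why a counter incremented in one file can be consumed elsewhere, here is a hedged, single-threaded userspace sketch of a reader that retries when a migration counter changes under it. It is an assumption about the general pattern, not the actual vDSO code: toy_shared, toy_on_migrate, toy_read_time and the one-shot test hook are all hypothetical names, and the memory barriers a real implementation would need are omitted.

```c
#include <assert.h>

/* Hypothetical shared state; names are illustrative, not the kernel's. */
struct toy_shared {
	unsigned int migrate_count; /* bumped by the migration callback */
	int current_cpu;            /* CPU the task is running on */
	long per_cpu_time[2];       /* per-CPU data the reader samples */
};

/* What a migration callback would do: record that the task moved. */
static void toy_on_migrate(struct toy_shared *s, int to_cpu)
{
	s->current_cpu = to_cpu;
	s->migrate_count++;
}

/* One-shot hook used only to simulate a migration in the middle of a read. */
static void (*toy_mid_read_hook)(struct toy_shared *s);

/* Reader: sample the counter, read per-CPU data, retry if the counter moved. */
static long toy_read_time(struct toy_shared *s, int *retries)
{
	unsigned int before;
	long t;

	*retries = -1;
	do {
		(*retries)++;
		before = s->migrate_count;
		int cpu = s->current_cpu;

		if (toy_mid_read_hook) {
			void (*hook)(struct toy_shared *) = toy_mid_read_hook;
			toy_mid_read_hook = 0; /* fire once */
			hook(s);
		}
		t = s->per_cpu_time[cpu]; /* may be stale if we migrated */
	} while (before != s->migrate_count);

	return t;
}

/* Helper for the simulation: pretend the task moved to CPU 1. */
static void toy_migrate_to_cpu1(struct toy_shared *s)
{
	toy_on_migrate(s, 1);
}
```

In this model the writer and the reader live in different functions, and the reader never increments the counter itself; it only compares two samples of it.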
>> And frankly, I think the static key is snake oil. The cost of task
>> migration in terms of cache misses and TLB misses is in no way
>> comparable to the cost of filling in a structure on the stack,
>> dereferencing the head of the notifiers list and seeing that it's NULL.
>
> The path this notifier is called from has nothing to do with those
> costs.
How not? The task is going to incur those costs; it's not like half
a dozen extra instructions make any difference. But anyway...
> And the fact you're inflicting these costs on _everyone_ for a
> single x86_64-paravirt case is insane.
... that's a valid objection. Please look at the patch below.
> I've had enough of this, the below goes into sched/urgent and you can
> come back with sane patches if and when you're ready.
Oh, please, cut the alpha male crap.
Paolo
------------------- 8< ----------------
From 4eb9d7132e1990c0586f28af3103675416d38974 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@...hat.com>
Date: Fri, 17 Apr 2015 14:57:34 +0200
Subject: [PATCH] sched: add CONFIG_TASK_MIGRATION_NOTIFIER
The task migration notifier is only used in x86 paravirt. Make it
possible to compile it out.
While at it, move some code around to ensure tmn is filled from CPU
registers.
Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
---
arch/x86/Kconfig | 1 +
init/Kconfig | 3 +++
kernel/sched/core.c | 9 ++++++++-
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d43e7e1c784b..9af252c8698d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -649,6 +649,7 @@ if HYPERVISOR_GUEST
config PARAVIRT
bool "Enable paravirtualization code"
+ select TASK_MIGRATION_NOTIFIER
---help---
This changes the kernel so it can modify itself when it is run
under a hypervisor, potentially improving performance significantly
diff --git a/init/Kconfig b/init/Kconfig
index 3b9df1aa35db..891917123338 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2016,6 +2016,9 @@ source "block/Kconfig"
config PREEMPT_NOTIFIERS
bool
+config TASK_MIGRATION_NOTIFIER
+ bool
+
config PADATA
depends on SMP
bool
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9123a82cbb6..c07a53aa543c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1016,12 +1016,14 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
rq_clock_skip_update(rq, true);
}
+#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
static ATOMIC_NOTIFIER_HEAD(task_migration_notifier);
void register_task_migration_notifier(struct notifier_block *n)
{
atomic_notifier_chain_register(&task_migration_notifier, n);
}
+#endif
#ifdef CONFIG_SMP
void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
@@ -1053,18 +1055,23 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
trace_sched_migrate_task(p, new_cpu);
if (task_cpu(p) != new_cpu) {
+#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
struct task_migration_notifier tmn;
+ int from_cpu = task_cpu(p);
+#endif
if (p->sched_class->migrate_task_rq)
p->sched_class->migrate_task_rq(p, new_cpu);
p->se.nr_migrations++;
perf_sw_event_sched(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 0);
+#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
tmn.task = p;
- tmn.from_cpu = task_cpu(p);
+ tmn.from_cpu = from_cpu;
tmn.to_cpu = new_cpu;
atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
+#endif
}
__set_task_cpu(p, new_cpu);
--
2.3.5