Message-ID: <20140319174222.213976eu8b54ujk0@intranet.cs.hku.hk>
Date: Wed, 19 Mar 2014 17:42:22 +0800
From: lwcheng@...hku.hk
To: linux-kernel@...r.kernel.org
Cc: glommer@...il.com
Subject: [BUG] Paravirtual time accounting / IRQ time accounting
In consolidated environments, where multiple virtual machines (VMs) share
one CPU core, timekeeping becomes a problem for the guest OS.
Here I report my findings about the Linux process scheduler.
Description
------------
Linux CFS relies on rq->clock_task to charge each task, determine
vruntime, etc.
When CONFIG_IRQ_TIME_ACCOUNTING is enabled, the time spent serving IRQs
is excluded from the updates to rq->clock_task.
When CONFIG_PARAVIRT_TIME_ACCOUNTING is enabled, the time stolen by
the hypervisor is also excluded from the updates to rq->clock_task.
With *both* CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_PARAVIRT_TIME_ACCOUNTING
enabled, I put three KVM guests on one core and ran hackbench in each
guest. I found that in the guests rq->clock_task stays *unchanged*.
This breaks CFS, which depends on that clock to charge tasks.
------------
Analysis
------------
[src/kernel/sched/core.c]
static void update_rq_clock_task(struct rq *rq, s64 delta)
{
	... ...
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
	irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
	... ...
	rq->prev_irq_time += irq_delta;
	delta -= irq_delta;
#endif
#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
	if (static_key_false((&paravirt_steal_rq_enabled))) {
		steal = paravirt_steal_clock(cpu_of(rq));
		steal -= rq->prev_steal_time_rq;
		... ...
		rq->prev_steal_time_rq += steal;
		delta -= steal;
	}
#endif
	rq->clock_task += delta;
	... ...
}
--
"delta" -> the intended increment to rq->clock_task
"irq_delta" -> the time spent on serving IRQ (hard + soft)
"steal" -> the time stolen by the underlying hypervisor
--
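"irq_delta" is calculated from sched_clock_cpu(), which is vulnerable
to VM scheduling delays: if the vCPU is descheduled while an IRQ is being
served, the time it spent off the CPU is counted as IRQ time. "irq_delta"
can therefore include part or all of "steal", so the same time is
subtracted from "delta" twice.
I observe that irq_delta + steal >> delta.
As a result, "delta" becomes zero. That is why rq->clock_task stops
advancing.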
------------
Please confirm this bug. Thanks.
Luwei Cheng
--
CS student
The University of Hong Kong