Date: Sat, 22 Mar 2014 15:15:40 +0800
From: lwcheng@...hku.hk
To: Rik van Riel <riel@...hat.com>
Cc: Glauber Costa <glommer@...il.com>,
    Peter Zijlstra <a.p.zijlstra@...llo.nl>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] Paravirtual time accounting / IRQ time accounting

Quoting Rik van Riel <riel@...hat.com>:

> On 03/20/2014 11:01 AM, Glauber Costa wrote:
>> On Wed, Mar 19, 2014 at 6:42 AM, <lwcheng@...hku.hk> wrote:
>
>>> ------------
>>> [src/kernel/sched/core.c]
>>> static void update_rq_clock_task(struct rq *rq, s64 delta)
>>> {
>>>     ... ...
>>> #ifdef CONFIG_IRQ_TIME_ACCOUNTING
>>>     irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
>>>     ... ...
>>>     rq->prev_irq_time += irq_delta;
>>>     delta -= irq_delta;
>>> #endif
>>>
>>> #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
>>>     if (static_key_false((&paravirt_steal_rq_enabled))) {
>>>         steal = paravirt_steal_clock(cpu_of(rq));
>>>         steal -= rq->prev_steal_time_rq;
>>>         ... ...
>>>         rq->prev_steal_time_rq += steal;
>>>         delta -= steal;
>>>     }
>>> #endif
>>>
>>>     rq->clock_task += delta;
>>>     ... ...
>>> }
>>> --
>>> "delta" -> the intended increment to rq->clock_task
>>> "irq_delta" -> the time spent on serving IRQ (hard + soft)
>>> "steal" -> the time stolen by the underlying hypervisor
>>> --
>>> "irq_delta" is calculated based on sched_clock_cpu(), which is vulnerable
>>> to VM scheduling delays.
>>
>> This looks like a real problem indeed. The main problem in searching
>> for a solution, is that of course not all of the irq time is steal
>> time and vice versa. In this case, we could subtract irq_time from
>> steal, and add only the steal part time that is in excess. I don't
>> think this is 100 % guaranteed, but maybe it is a good approximation.
>>
>> Rik, do you have an opinion on this ?
>
> The other way around may be better, since steal time (when it
> happens) is likely to be of "time slice" magnitude, while irq
> time will happen more frequently, and in dozens-of-microseconds
> intervals.
>
> Furthermore, we have no way to measure what the irq time is,
> except by looking at how much real time elapsed. For steal time,
> however, the hypervisor tells us exactly how much time was stolen.
>
> That means when steal time and irq time happen simultaneously,
> the amount of steal time should always be smaller than the
> calculated irq time for that period.
>
> actual irq_time = calculated irq time - reported steal time;
>
> --
> All rights reversed
>

I observe that sometimes irq_time includes only "part" of steal_time.
As you said, irq_time is on the order of dozens of microseconds.
In VMs, where all the devices seen are virtual ones, irq_time does not
seem to be as desirable as it is on physical hosts.

A quick (but not radical) solution may be:
disable CONFIG_IRQ_TIME_ACCOUNTING if CONFIG_PARAVIRT is selected,
and just adopt tick-based accounting: CONFIG_TICK_CPU_ACCOUNTING

I am wondering what irq_time really *means* in VMs.

-Luwei

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
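
For illustration only, the two corrections discussed in the thread can be
sketched in userspace C. This is not kernel code and not a proposed patch:
every function name below is hypothetical, the values are plain nanosecond
counts for one update interval, and the clamping at zero is an assumption
of the sketch (motivated by the observation that irq_time sometimes
includes only part of the steal time).

```c
#include <assert.h>
#include <stdint.h>

/* All arguments are nanoseconds accumulated over one update interval. */

/* Rik's suggestion: trust the steal time reported by the hypervisor and
 * remove it from the irq time, which is derived from elapsed real time.
 * Clamping at zero is an assumption of this sketch. */
static int64_t irq_minus_steal(int64_t irq_delta, int64_t steal)
{
	assert(irq_delta >= 0 && steal >= 0);
	return irq_delta > steal ? irq_delta - steal : 0;
}

/* Glauber's suggestion: keep irq_delta as computed and charge only the
 * part of the steal time that exceeds it. */
static int64_t steal_minus_irq(int64_t irq_delta, int64_t steal)
{
	assert(irq_delta >= 0 && steal >= 0);
	return steal > irq_delta ? steal - irq_delta : 0;
}

/* What the quoted update_rq_clock_task() effectively adds to clock_task:
 * both quantities are subtracted in full, so overlapping time is
 * subtracted twice. */
static int64_t task_delta_original(int64_t delta, int64_t irq_delta,
				   int64_t steal)
{
	return delta - irq_delta - steal;
}

/* Increment to clock_task with Rik's correction applied. */
static int64_t task_delta_rik(int64_t delta, int64_t irq_delta,
			      int64_t steal)
{
	return delta - irq_minus_steal(irq_delta, steal) - steal;
}

/* Increment to clock_task with Glauber's correction applied. */
static int64_t task_delta_glauber(int64_t delta, int64_t irq_delta,
				  int64_t steal)
{
	return delta - irq_delta - steal_minus_irq(irq_delta, steal);
}
```

As a made-up example, take delta = 10 ms, steal = 6 ms, and an irq handler
that really ran for 50 us but was preempted by the hypervisor for the whole
6 ms, so that irq_delta = 6.05 ms. The original arithmetic gives
10 - 6.05 - 6 = -2.05 ms, while both corrected variants give 3.95 ms. In
this sketch the two variants always subtract max(irq_delta, steal) in
total, so they differ only in how the overlapping time is attributed to
irq versus steal, not in the resulting clock_task increment.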