Message-ID: <20070513152019.GA25236@Krystal>
Date: Sun, 13 May 2007 11:20:20 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Alan Cox <alan@...rguk.ukuu.org.uk>
Cc: Andi Kleen <ak@....de>, systemtap@...rces.redhat.com,
prasanna@...ibm.com, ananth@...ibm.com,
anil.s.keshavamurthy@...el.com, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, hch@...radead.org
Subject: Re: [patch 05/10] Linux Kernel Markers - i386 optimized version
* Alan Cox (alan@...rguk.ukuu.org.uk) wrote:
> > The IPI might be fast, but I have seen interrupts being disabled for
> > quite a long time in some kernel code paths. Having interrupts disabled
> > on _each cpu_ while running an IPI handler waiting to be synchronized
> > with other CPUs has this side-effect. Therefore, if I understand well,
>
> This can already occur worst case when we spin on an IPI (eg a cross CPU
> TLB shootdown)
>
Hrm, maybe I am misunderstanding something here:
arch/i386/kernel/smp.c: native_flush_tlb_others() takes a spinlock, but
does not disable interrupts, while spinning waiting for other CPUs.
smp_invalidate_interrupt(), in the same file, does not spin waiting for
other CPUs. Therefore, I understand that none of these functions spin
with interrupts disabled, so this TLB flush does not show the same
behavior.
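
To make the concern quoted above concrete, here is a small model, not
kernel code. It assumes a hypothetical rendezvous in which every CPU
enters the IPI handler with interrupts off and stays there until the last
CPU has arrived. The function name and the figures in the note below are
made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (an assumption, not kernel code): entry_us[i] is how long CPU i
 * takes to enter the IPI handler, e.g. because it was running with
 * interrupts disabled when the IPI was sent. Every CPU then spins, with
 * interrupts off, until the last CPU has entered.
 *
 * Returns how long CPU `cpu` spins with interrupts disabled.
 */
unsigned long rendezvous_spin_us(const unsigned long *entry_us, size_t ncpus,
                                 size_t cpu)
{
    unsigned long last = 0;
    size_t i;

    assert(ncpus > 0 && cpu < ncpus);

    /* The rendezvous completes when the slowest CPU arrives. */
    for (i = 0; i < ncpus; i++)
        if (entry_us[i] > last)
            last = entry_us[i];

    return last - entry_us[cpu];
}
```

With entry times of 0, 5 and 120 (microseconds, invented numbers), CPU 0
spins for 120 while CPU 2 does not spin at all: in this model, one long
interrupts-off section on a single CPU is exported to every other CPU.
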
> If the INT3 is acknowledged as safe by intel either as is or by some
> specific usage like lock mov then great. If not it isn't too bad a
> problem.
>
Another mail in this thread explains that the main issue is not the
atomicity of the code modification operation (although it must be atomic
for the CPU to see a correct instruction), but the fact that the CPU
expects the pre-fetched instruction and the executed instruction to be
the same, except for the int3 case.
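
The exact patching sequence is not spelled out in this message. As a
rough sketch, one commonly described ordering that relies on the int3
exception is simulated below on a plain byte buffer. patch_with_int3()
and sync_all_cpus() are hypothetical names, and the synchronization step
is a no-op placeholder for whatever the real code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_OPCODE 0xCC  /* one-byte x86 breakpoint instruction */

/* Hypothetical stand-in for "make every CPU serialize"; a no-op here. */
static void sync_all_cpus(void) { }

/*
 * Simulation only: replace the len-byte instruction at `site` with `insn`
 * so that a valid first opcode byte is never followed by a mix of old and
 * new bytes. A real implementation would also need a breakpoint handler
 * that deals with CPUs hitting the int3 while the site is being rewritten.
 */
void patch_with_int3(unsigned char *site, const unsigned char *insn, size_t len)
{
    assert(len >= 1);

    site[0] = INT3_OPCODE;                /* 1. trap anyone reaching the site */
    sync_all_cpus();

    memcpy(site + 1, insn + 1, len - 1);  /* 2. rewrite the tail bytes */
    sync_all_cpus();

    site[0] = insn[0];                    /* 3. install the new first byte */
    sync_all_cpus();
}
```

Only the single-byte stores in steps 1 and 3 touch the opcode byte, which
is why the atomicity of the larger write in step 2 matters less in this
scheme.
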
> And to be real about this - how many benchmarks do you know that care
> about mega-kernel-debugs per second ?
For users with real-time needs, the overall IRQ latency of the system
puts an upper bound on what the application can execute in a given
time frame. People doing audio/video acquisition should be quite
interested in this metric.
So this is mostly a matter of how this action (enabling a marker) can
influence the overall system's latency. One of my goals is to provide
tracing in the Linux kernel with minimal performance and behavioral
impact on the system, so that it does not make the system flakier than
normal and can be activated on a misbehaving system while still
reproducing the original problem.
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68