[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.0810161701470.3288@nehalem.linux-foundation.org>
Date: Thu, 16 Oct 2008 17:08:59 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>,
David Miller <davem@...emloft.net>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [RFC patch 15/15] LTTng timestamp x86
On Thu, 16 Oct 2008, Mathieu Desnoyers wrote:
>
> +static inline cycles_t ltt_async_tsc_read(void)
(a) this shouldn't be inline
> + rdtsc_barrier();
> + new_tsc = get_cycles();
> + rdtsc_barrier();
> + do {
> + last_tsc = ltt_last_tsc;
> + if (new_tsc < last_tsc)
> + new_tsc = last_tsc + LTT_MIN_PROBE_DURATION;
> + /*
> + * If cmpxchg fails with a value higher than the new_tsc, don't
> + * retry : the value has been incremented and the events
> + * happened almost at the same time.
> + * We must retry if cmpxchg fails with a lower value :
> + * it means that we are the CPU with highest frequency and
> + * therefore MUST update the value.
> + */
> + } while (cmpxchg64(<t_last_tsc, last_tsc, new_tsc) < new_tsc);
(b) This is really quite expensive.
Why do things like this? Make the timestamps be per-cpu. If you do things
like the above, then just getting the timestamp means that every single
trace event will cause a cacheline bounce, and if you do that, you might
as well just not have per-cpu tracing at all.
It really boils down to two cases:
- you do per-CPU traces
If so, you need to ONLY EVER touch per-cpu data when tracing, and the
above is a fundamental BUG. Dirtying shared cachelines makes the whole
per-cpu thing pointless.
- you do global traces
Sure, then the above works, but why bother? You'll get the ordering
from the global trace, you might as well do time stamps with local
counts.
So in neither case does it make any sense to try to do that global
ltt_last_tsc.
Perhaps more importantly - if the TSC really are out of whack, that just
means that now all your timestamps are worthless, because the value you
calculate ends up having NOTHING to do with the timestamp. So you cannot
even use it to see how long something took, because it may be that you're
running on the CPU that runs behind, and all you ever see is the value of
LTT_MIN_PROBE_DURATION.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists