Date:	Thu, 16 Oct 2008 17:08:59 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	linux-arch@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Thomas Gleixner <tglx@...utronix.de>,
	David Miller <davem@...emloft.net>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [RFC patch 15/15] LTTng timestamp x86



On Thu, 16 Oct 2008, Mathieu Desnoyers wrote:
>
> +static inline cycles_t ltt_async_tsc_read(void)

(a) this shouldn't be inline

> +	rdtsc_barrier();
> +	new_tsc = get_cycles();
> +	rdtsc_barrier();
> +	do {
> +		last_tsc = ltt_last_tsc;
> +		if (new_tsc < last_tsc)
> +			new_tsc = last_tsc + LTT_MIN_PROBE_DURATION;
> +		/*
> +		 * If cmpxchg fails with a value higher than the new_tsc, don't
> +		 * retry : the value has been incremented and the events
> +		 * happened almost at the same time.
> +		 * We must retry if cmpxchg fails with a lower value :
> +		 * it means that we are the CPU with highest frequency and
> +		 * therefore MUST update the value.
> +		 */
> +	} while (cmpxchg64(&ltt_last_tsc, last_tsc, new_tsc) < new_tsc);

(b) This is really quite expensive.

Why do things like this? Make the timestamps be per-cpu. If you do things 
like the above, then just getting the timestamp means that every single 
trace event will cause a cacheline bounce, and if you do that, you might 
as well just not have per-cpu tracing at all.

It really boils down to two cases:

 - you do per-CPU traces

   If so, you need to ONLY EVER touch per-cpu data when tracing, and the 
   above is a fundamental BUG. Dirtying shared cachelines makes the whole 
   per-cpu thing pointless.

 - you do global traces

   Sure, then the above works, but why bother? You'll get the ordering 
   from the global trace, you might as well do time stamps with local 
   counts.

So in neither case does it make any sense to try to do that global 
ltt_last_tsc.
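
The per-CPU alternative can be sketched like this: each CPU keeps its own
last-seen value, so taking a timestamp never touches a shared cacheline and
needs no cmpxchg at all. This is only an illustrative userspace sketch, not
the actual LTTng code: __thread stands in for kernel per-cpu data, and
fake_cycles() (with its hard-coded readings) stands in for get_cycles(); all
of those names are assumptions made for the sake of a runnable demo.

```c
#include <stdint.h>

/* Readings that step backwards once, mimicking an out-of-sync TSC.
 * This is a fabricated stand-in for get_cycles(). */
static uint64_t readings[] = { 100, 200, 150, 300 };
static int idx;

static uint64_t fake_cycles(void)
{
	return readings[idx++];
}

/* Per-"CPU" state: in the kernel this would be per-cpu data, so it is
 * never written by another CPU and never bounces between caches. */
static __thread uint64_t last_tsc;

static uint64_t percpu_tsc_read(void)
{
	uint64_t now = fake_cycles();

	/* Purely local monotonicity fixup: no cmpxchg, no shared line.
	 * The +1 stands in for LTT_MIN_PROBE_DURATION. */
	if (now < last_tsc)
		now = last_tsc + 1;
	last_tsc = now;
	return now;
}
```

With the readings above, the backwards step (200 -> 150) gets clamped to
201, and only this CPU's own cacheline is ever dirtied.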

Perhaps more importantly - if the TSCs really are out of whack, that just 
means that now all your timestamps are worthless, because the value you 
calculate ends up having NOTHING to do with the timestamp. So you cannot 
even use it to see how long something took, because it may be that you're 
running on the CPU that runs behind, and all you ever see is the value of 
LTT_MIN_PROBE_DURATION.
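
That failure mode is easy to demonstrate: once the lagging CPU's TSC is
behind the shared value, every read it does hits the clamp, so every
interval it measures collapses to LTT_MIN_PROBE_DURATION no matter how long
the work actually took. The sketch below is single-threaded (so a plain
store replaces the cmpxchg64 of the patch) and the concrete numbers are
made up for illustration.

```c
#include <stdint.h>

#define LTT_MIN_PROBE_DURATION 1

/* The shared value from the patch; a faster CPU has already pushed it
 * far ahead of this CPU's local TSC. */
static uint64_t ltt_last_tsc = 10000;

static uint64_t lagging_read(uint64_t local_tsc)
{
	uint64_t ts = local_tsc;

	if (ts < ltt_last_tsc)
		ts = ltt_last_tsc + LTT_MIN_PROBE_DURATION;
	/* cmpxchg64 in the real code; plain store in this sketch. */
	ltt_last_tsc = ts;
	return ts;
}
```

Reading at local TSC 100 and again at 600 (500 cycles of real work) yields
two clamped timestamps exactly LTT_MIN_PROBE_DURATION apart - the 500
cycles are simply gone from the trace.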

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/